date:20230713

[PATCH] target/i386: Check CR0.TS before enter_mmx

2023-07-13 Thread Matt Borgerson

When CR0.TS=1, execution of x87 FPU, MMX, and some SSE instructions will
cause a Device Not Available (DNA) exception (#NM). System software uses
this exception event to lazily context switch FPU state.

Before this patch, enter_mmx helpers may be generated just before #NM
generation, prematurely resetting FPU state before the guest has a
chance to save it.

Signed-off-by: Matt Borgerson 
---
 target/i386/tcg/decode-new.c.inc | 11 +++
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/target/i386/tcg/decode-new.c.inc b/target/i386/tcg/decode-new.c.inc
index 46afd9960b..0ead1c6011 100644
--- a/target/i386/tcg/decode-new.c.inc
+++ b/target/i386/tcg/decode-new.c.inc
@@ -1803,16 +1803,19 @@ static void disas_insn_new(DisasContext *s,
CPUState *cpu, int b)
 }
 break;

-case X86_SPECIAL_MMX:
-if (!(s->prefix & (PREFIX_REPZ | PREFIX_REPNZ | PREFIX_DATA))) {
-gen_helper_enter_mmx(cpu_env);
-}
+default:
 break;
 }

 if (!validate_vex(s, &decode)) {
 return;
 }
+
+if (decode.e.special == X86_SPECIAL_MMX &&
+!(s->prefix & (PREFIX_REPZ | PREFIX_REPNZ | PREFIX_DATA))) {
+gen_helper_enter_mmx(cpu_env);
+}
+
 if (decode.op[0].has_ea || decode.op[1].has_ea || decode.op[2].has_ea) {
 gen_load_ea(s, &decode.mem, decode.e.vex_class == 12);
 }
-- 
2.34.1

Re: [PATCH v7 1/9] migration: introduced 'MigrateAddress' in QAPI for migration wire protocol.

2023-07-13 Thread Het Gala




On 12/07/23 6:25 pm, Markus Armbruster wrote:

The subject

 migration: introduced 'MigrateAddress' in QAPI for migration wire protocol.

is rather long.  Try to limit subjects to about 60 characters.  Easily
done here:

 migration: New QAPI type 'MigrateAddress'

Ack. Thanks for the suggestion Markus.

Het Gala  writes:


This patch introduces well defined MigrateAddress struct and its related
child objects.

The existing argument of 'migrate' and 'migrate-incoming' QAPI - 'uri'
is of string type. The current migration flow follows double encoding
scheme for  fetching migration parameters such as 'uri' and this is
not an ideal design.

Motive for intoducing struct level design is to prevent double encoding
of QAPI arguments, as Qemu should be able to directly use the QAPI
arguments without any level of encoding.

Suggest to mention that this commit only adds the type, and actual uses
come in later commits.

Ack.

Suggested-by: Aravind Retnakaran 
Signed-off-by: Het Gala 
Reviewed-by: Juan Quintela 
Reviewed-by: Daniel P. Berrangé 
---
  qapi/migration.json | 41 +
  1 file changed, 41 insertions(+)

diff --git a/qapi/migration.json b/qapi/migration.json
index 47dfef0278..b583642c2d 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -1417,6 +1417,47 @@
  ##
  { 'command': 'migrate-continue', 'data': {'state': 'MigrationStatus'} }
  
+##

+# @MigrationAddressType:
+#
+# The migration stream transport mechanisms.
+#
+# @socket: Migrate via socket.
+#
+# @exec: Direct the migration stream to another process.
+#
+# @rdma: Migrate via RDMA.
+#
+# Since 8.1
+##
+{ 'enum': 'MigrationAddressType',
+  'data': ['socket', 'exec', 'rdma'] }
+
+##
+# @MigrationExecCommand:
+#
+# @args: command (list head) and arguments to execute.
+#
+# Since 8.1
+##
+{ 'struct': 'MigrationExecCommand',
+  'data': {'args': [ 'str' ] } }
+
+##
+# @MigrationAddress:
+#
+# Migration endpoint configuration.
+#
+# Since 8.1
+##
+{ 'union': 'MigrationAddress',
+  'base': { 'transport' : 'MigrationAddressType'},
+  'discriminator': 'transport',
+  'data': {
+'socket': 'SocketAddress',
+'exec': 'MigrationExecCommand',
+'rdma': 'InetSocketAddress' } }
+
  ##
  # @migrate:
  #

The issues I'm pointing out don't justfy yet another respin.  But if you
need to respin the series for some other reason, plase take care of them.
Ack. I think from QAPI side, the patches look good. But from code 
implementation side, I haven't received acked-by or reviewd-by from the 
maintainers. So would anyway need to respin the series I think. Cc'ing 
Daniel, Peter, Juan and other migration maintainers for reviews on other 
patches too.

Acked-by: Markus Armbruster 


Regards,
Het Gala

[PATCH v5 2/5] target/riscv: check the in-kernel irqchip support

2023-07-13 Thread Yong-Xuan Wang

We check the in-kernel irqchip support when using KVM acceleration.

Signed-off-by: Yong-Xuan Wang 
Reviewed-by: Jim Shu 
Reviewed-by: Daniel Henrique Barboza 
---
 target/riscv/kvm.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/target/riscv/kvm.c b/target/riscv/kvm.c
index 9d8a8982f9..005e054604 100644
--- a/target/riscv/kvm.c
+++ b/target/riscv/kvm.c
@@ -914,7 +914,15 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
 
 int kvm_arch_irqchip_create(KVMState *s)
 {
-return 0;
+if (kvm_kernel_irqchip_split()) {
+error_report("-machine kernel_irqchip=split is not supported on 
RISC-V.");
+exit(1);
+}
+
+/*
+ * We can create the VAIA using the newer device control API.
+ */
+return kvm_check_extension(s, KVM_CAP_DEVICE_CTRL);
 }
 
 int kvm_arch_process_async_events(CPUState *cs)
-- 
2.17.1

[PATCH v5 0/5] Add RISC-V KVM AIA Support

2023-07-13 Thread Yong-Xuan Wang

This series adds support for KVM AIA in RISC-V architecture.

In order to test these patches, we require Linux with KVM AIA support which can
be found in the riscv_kvm_aia_hwaccel_v1 branch at
https://github.com/avpatel/linux.git

---
v5:
- remove the linux-header update patch since the riscv-to-apply.next QEMU has
synced up to Linux 6.5-rc1 headers.
- create the APLIC and IMSIC FDT helper functions in PATCH1
- add the irqfd support in PATCH3
- fix the comments and refine the code

v4:
- update the linux header by the scripts/update-linux-headers.sh in PATCH1
- remove the checking for "aplic_m" before creating S-mode APLIC device in 
PATCH2
- add more setting when we initialize the KVM AIA chip in PATCH4
- fix msi message delivery and the APLIC devices emulation in PATCH5
- fix the AIA devices mapping with NUMA enabled in PATCH6
- add "kvm-aia" parameter to sepecify the KVM AIA mode in PATCH6

v3:
- fix typo
- tag the linux-header patch as placeholder

v2:
- rebase to riscv-to-apply.next
- update the linux header by the scripts/update-linux-headers.sh

Yong-Xuan Wang (5):
  target/riscv: support the AIA device emulation with KVM enabled
  target/riscv: check the in-kernel irqchip support
  target/riscv: Create an KVM AIA irqchip
  target/riscv: update APLIC and IMSIC to support KVM AIA
  target/riscv: select KVM AIA in riscv virt machine

 hw/intc/riscv_aplic.c|  56 --
 hw/intc/riscv_imsic.c|  25 ++-
 hw/riscv/virt.c  | 381 ++-
 include/hw/riscv/virt.h  |   1 +
 target/riscv/kvm.c   | 170 -
 target/riscv/kvm_riscv.h |   6 +
 6 files changed, 452 insertions(+), 187 deletions(-)

-- 
2.17.1

[PATCH v5 5/5] target/riscv: select KVM AIA in riscv virt machine

2023-07-13 Thread Yong-Xuan Wang

Select KVM AIA when the host kernel has in-kernel AIA chip support.
Since KVM AIA only has one APLIC instance, we map the QEMU APLIC
devices to KVM APLIC.
We also extend virt machine to specify the KVM AIA mode. The "kvm-aia"
parameter is passed along with machine name in QEMU command-line.
1) "kvm-aia=emul": IMSIC is emulated by hypervisor
2) "kvm-aia=hwaccel": use hardware guest IMSIC
3) "kvm-aia=auto": use the hardware guest IMSICs whenever available
   otherwise we fallback to software emulation.

Signed-off-by: Yong-Xuan Wang 
Reviewed-by: Jim Shu 
Reviewed-by: Daniel Henrique Barboza 
---
 hw/riscv/virt.c | 125 ++--
 include/hw/riscv/virt.h |   1 +
 2 files changed, 97 insertions(+), 29 deletions(-)

diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c
index 26b3aff28e..b74142d978 100644
--- a/hw/riscv/virt.c
+++ b/hw/riscv/virt.c
@@ -35,6 +35,7 @@
 #include "hw/riscv/virt.h"
 #include "hw/riscv/boot.h"
 #include "hw/riscv/numa.h"
+#include "kvm_riscv.h"
 #include "hw/intc/riscv_aclint.h"
 #include "hw/intc/riscv_aplic.h"
 #include "hw/intc/riscv_imsic.h"
@@ -75,6 +76,12 @@
 #error "Can't accomodate all IMSIC groups in address space"
 #endif
 
+/* KVM AIA only supports APLIC MSI. APLIC Wired is always emulated by QEMU. */
+static bool virt_use_kvm_aia(RISCVVirtState *s)
+{
+return kvm_irqchip_in_kernel() && s->aia_type == VIRT_AIA_TYPE_APLIC_IMSIC;
+}
+
 static const MemMapEntry virt_memmap[] = {
 [VIRT_DEBUG] ={0x0, 0x100 },
 [VIRT_MROM] = { 0x1000,0xf000 },
@@ -609,16 +616,16 @@ static void create_fdt_one_aplic(RISCVVirtState *s, int 
socket,
  uint32_t *intc_phandles,
  uint32_t aplic_phandle,
  uint32_t aplic_child_phandle,
- bool m_mode)
+ bool m_mode, int num_harts)
 {
 int cpu;
 char *aplic_name;
 uint32_t *aplic_cells;
 MachineState *ms = MACHINE(s);
 
-aplic_cells = g_new0(uint32_t, s->soc[socket].num_harts * 2);
+aplic_cells = g_new0(uint32_t, num_harts * 2);
 
-for (cpu = 0; cpu < s->soc[socket].num_harts; cpu++) {
+for (cpu = 0; cpu < num_harts; cpu++) {
 aplic_cells[cpu * 2 + 0] = cpu_to_be32(intc_phandles[cpu]);
 aplic_cells[cpu * 2 + 1] = cpu_to_be32(m_mode ? IRQ_M_EXT : IRQ_S_EXT);
 }
@@ -632,7 +639,7 @@ static void create_fdt_one_aplic(RISCVVirtState *s, int 
socket,
 
 if (s->aia_type == VIRT_AIA_TYPE_APLIC) {
 qemu_fdt_setprop(ms->fdt, aplic_name, "interrupts-extended",
-aplic_cells, s->soc[socket].num_harts * sizeof(uint32_t) * 2);
+aplic_cells, num_harts * sizeof(uint32_t) * 2);
 } else {
 qemu_fdt_setprop_cell(ms->fdt, aplic_name, "msi-parent", msi_phandle);
 }
@@ -662,7 +669,8 @@ static void create_fdt_socket_aplic(RISCVVirtState *s,
 uint32_t msi_s_phandle,
 uint32_t *phandle,
 uint32_t *intc_phandles,
-uint32_t *aplic_phandles)
+uint32_t *aplic_phandles,
+int num_harts)
 {
 char *aplic_name;
 unsigned long aplic_addr;
@@ -679,7 +687,7 @@ static void create_fdt_socket_aplic(RISCVVirtState *s,
 create_fdt_one_aplic(s, socket, aplic_addr, memmap[VIRT_APLIC_M].size,
  msi_m_phandle, intc_phandles,
  aplic_m_phandle, aplic_s_phandle,
- true);
+ true, num_harts);
 }
 
 /* S-level APLIC node */
@@ -688,7 +696,7 @@ static void create_fdt_socket_aplic(RISCVVirtState *s,
 create_fdt_one_aplic(s, socket, aplic_addr, memmap[VIRT_APLIC_S].size,
  msi_s_phandle, intc_phandles,
  aplic_s_phandle, 0,
- false);
+ false, num_harts);
 
 aplic_name = g_strdup_printf("/soc/aplic@%lx", aplic_addr);
 
@@ -772,34 +780,48 @@ static void create_fdt_sockets(RISCVVirtState *s, const 
MemMapEntry *memmap,
 *msi_pcie_phandle = msi_s_phandle;
 }
 
-phandle_pos = ms->smp.cpus;
-for (socket = (socket_count - 1); socket >= 0; socket--) {
-phandle_pos -= s->soc[socket].num_harts;
+/* KVM AIA only has one APLIC instance */
+if (virt_use_kvm_aia(s)) {
+create_fdt_socket_aplic(s, memmap, 0,
+msi_m_phandle, msi_s_phandle, phandle,
+&intc_phandles[0], xplic_phandles, ms->smp.cpus);
+} else {
+phandle_pos = ms->smp.cpus;
+for (socket = (socket_count - 1); socket >= 0; socket--) {
+phandle_pos -= s->soc[socket].num_harts;
 
-if (s->aia_type == VIRT_AIA_TYPE_NONE) {
-create_fdt_socket_p

[PATCH v5 4/5] target/riscv: update APLIC and IMSIC to support KVM AIA

2023-07-13 Thread Yong-Xuan Wang

KVM AIA can't emulate APLIC only. When "aia=aplic" parameter is passed,
APLIC devices is emulated by QEMU. For "aia=aplic-imsic", remove the
mmio operations of APLIC when using KVM AIA and send wired interrupt
signal via KVM_IRQ_LINE API.
After KVM AIA enabled, MSI messages are delivered by KVM_SIGNAL_MSI API
when the IMSICs receive mmio write requests.

Signed-off-by: Yong-Xuan Wang 
Reviewed-by: Jim Shu 
Reviewed-by: Daniel Henrique Barboza 
Reviewed-by: Andrew Jones 
---
 hw/intc/riscv_aplic.c | 56 ++-
 hw/intc/riscv_imsic.c | 25 +++
 2 files changed, 61 insertions(+), 20 deletions(-)

diff --git a/hw/intc/riscv_aplic.c b/hw/intc/riscv_aplic.c
index 4bdc6a5d1a..eab5db053b 100644
--- a/hw/intc/riscv_aplic.c
+++ b/hw/intc/riscv_aplic.c
@@ -31,6 +31,7 @@
 #include "hw/irq.h"
 #include "target/riscv/cpu.h"
 #include "sysemu/sysemu.h"
+#include "sysemu/kvm.h"
 #include "migration/vmstate.h"
 
 #define APLIC_MAX_IDC  (1UL << 14)
@@ -148,6 +149,15 @@
 
 #define APLIC_IDC_CLAIMI   0x1c
 
+/*
+ * KVM AIA only supports APLIC MSI, fallback to QEMU emulation if we want to 
use
+ * APLIC Wired.
+ */
+static bool is_kvm_aia(bool msimode)
+{
+return kvm_irqchip_in_kernel() && msimode;
+}
+
 static uint32_t riscv_aplic_read_input_word(RISCVAPLICState *aplic,
 uint32_t word)
 {
@@ -471,6 +481,11 @@ static uint32_t riscv_aplic_idc_claimi(RISCVAPLICState 
*aplic, uint32_t idc)
 return topi;
 }
 
+static void riscv_kvm_aplic_request(void *opaque, int irq, int level)
+{
+kvm_set_irq(kvm_state, irq, !!level);
+}
+
 static void riscv_aplic_request(void *opaque, int irq, int level)
 {
 bool update = false;
@@ -801,29 +816,35 @@ static void riscv_aplic_realize(DeviceState *dev, Error 
**errp)
 uint32_t i;
 RISCVAPLICState *aplic = RISCV_APLIC(dev);
 
-aplic->bitfield_words = (aplic->num_irqs + 31) >> 5;
-aplic->sourcecfg = g_new0(uint32_t, aplic->num_irqs);
-aplic->state = g_new0(uint32_t, aplic->num_irqs);
-aplic->target = g_new0(uint32_t, aplic->num_irqs);
-if (!aplic->msimode) {
-for (i = 0; i < aplic->num_irqs; i++) {
-aplic->target[i] = 1;
+if (!is_kvm_aia(aplic->msimode)) {
+aplic->bitfield_words = (aplic->num_irqs + 31) >> 5;
+aplic->sourcecfg = g_new0(uint32_t, aplic->num_irqs);
+aplic->state = g_new0(uint32_t, aplic->num_irqs);
+aplic->target = g_new0(uint32_t, aplic->num_irqs);
+if (!aplic->msimode) {
+for (i = 0; i < aplic->num_irqs; i++) {
+aplic->target[i] = 1;
+}
 }
-}
-aplic->idelivery = g_new0(uint32_t, aplic->num_harts);
-aplic->iforce = g_new0(uint32_t, aplic->num_harts);
-aplic->ithreshold = g_new0(uint32_t, aplic->num_harts);
+aplic->idelivery = g_new0(uint32_t, aplic->num_harts);
+aplic->iforce = g_new0(uint32_t, aplic->num_harts);
+aplic->ithreshold = g_new0(uint32_t, aplic->num_harts);
 
-memory_region_init_io(&aplic->mmio, OBJECT(dev), &riscv_aplic_ops, aplic,
-  TYPE_RISCV_APLIC, aplic->aperture_size);
-sysbus_init_mmio(SYS_BUS_DEVICE(dev), &aplic->mmio);
+memory_region_init_io(&aplic->mmio, OBJECT(dev), &riscv_aplic_ops,
+ aplic, TYPE_RISCV_APLIC, aplic->aperture_size);
+sysbus_init_mmio(SYS_BUS_DEVICE(dev), &aplic->mmio);
+}
 
 /*
  * Only root APLICs have hardware IRQ lines. All non-root APLICs
  * have IRQ lines delegated by their parent APLIC.
  */
 if (!aplic->parent) {
-qdev_init_gpio_in(dev, riscv_aplic_request, aplic->num_irqs);
+if (is_kvm_aia(aplic->msimode)) {
+qdev_init_gpio_in(dev, riscv_kvm_aplic_request, aplic->num_irqs);
+} else {
+qdev_init_gpio_in(dev, riscv_aplic_request, aplic->num_irqs);
+}
 }
 
 /* Create output IRQ lines for non-MSI mode */
@@ -958,7 +979,10 @@ DeviceState *riscv_aplic_create(hwaddr addr, hwaddr size,
 qdev_prop_set_bit(dev, "mmode", mmode);
 
 sysbus_realize_and_unref(SYS_BUS_DEVICE(dev), &error_fatal);
-sysbus_mmio_map(SYS_BUS_DEVICE(dev), 0, addr);
+
+if (!is_kvm_aia(msimode)) {
+sysbus_mmio_map(SYS_BUS_DEVICE(dev), 0, addr);
+}
 
 if (parent) {
 riscv_aplic_add_child(parent, dev);
diff --git a/hw/intc/riscv_imsic.c b/hw/intc/riscv_imsic.c
index fea3385b51..760dbddcf7 100644
--- a/hw/intc/riscv_imsic.c
+++ b/hw/intc/riscv_imsic.c
@@ -32,6 +32,7 @@
 #include "target/riscv/cpu.h"
 #include "target/riscv/cpu_bits.h"
 #include "sysemu/sysemu.h"
+#include "sysemu/kvm.h"
 #include "migration/vmstate.h"
 
 #define IMSIC_MMIO_PAGE_LE 0x00
@@ -283,6 +284,20 @@ static void riscv_imsic_write(void *opaque, hwaddr addr, 
uint64_t value,
 goto err;
 }
 
+#if defined(CONFIG_KVM)
+if (kvm_irqchip_in_kernel()) {
+struct kvm

[PATCH v5 3/5] target/riscv: Create an KVM AIA irqchip

2023-07-13 Thread Yong-Xuan Wang

We create a vAIA chip by using the KVM_DEV_TYPE_RISCV_AIA and then set up
the chip with the KVM_DEV_RISCV_AIA_GRP_* APIs.

Signed-off-by: Yong-Xuan Wang 
Reviewed-by: Jim Shu 
Reviewed-by: Daniel Henrique Barboza 
---
 target/riscv/kvm.c   | 160 +++
 target/riscv/kvm_riscv.h |   6 ++
 2 files changed, 166 insertions(+)

diff --git a/target/riscv/kvm.c b/target/riscv/kvm.c
index 005e054604..64156c15ec 100644
--- a/target/riscv/kvm.c
+++ b/target/riscv/kvm.c
@@ -36,6 +36,7 @@
 #include "exec/address-spaces.h"
 #include "hw/boards.h"
 #include "hw/irq.h"
+#include "hw/intc/riscv_imsic.h"
 #include "qemu/log.h"
 #include "hw/loader.h"
 #include "kvm_riscv.h"
@@ -43,6 +44,7 @@
 #include "chardev/char-fe.h"
 #include "migration/migration.h"
 #include "sysemu/runstate.h"
+#include "hw/riscv/numa.h"
 
 static uint64_t kvm_riscv_reg_id(CPURISCVState *env, uint64_t type,
  uint64_t idx)
@@ -1026,3 +1028,161 @@ bool kvm_arch_cpu_check_are_resettable(void)
 void kvm_arch_accel_class_init(ObjectClass *oc)
 {
 }
+
+char *kvm_aia_mode_str(uint64_t aia_mode)
+{
+const char *val;
+
+switch (aia_mode) {
+case KVM_DEV_RISCV_AIA_MODE_EMUL:
+return "emul";
+case KVM_DEV_RISCV_AIA_MODE_HWACCEL:
+return "hwaccel";
+case KVM_DEV_RISCV_AIA_MODE_AUTO:
+default:
+return "auto";
+};
+}
+
+void kvm_riscv_aia_create(MachineState *machine,
+  uint64_t aia_mode, uint64_t group_shift,
+  uint64_t aia_irq_num, uint64_t aia_msi_num,
+  uint64_t aplic_base, uint64_t imsic_base,
+  uint64_t guest_num)
+{
+int ret, i;
+int aia_fd = -1;
+uint64_t default_aia_mode;
+uint64_t socket_count = riscv_socket_count(machine);
+uint64_t max_hart_per_socket = 0;
+uint64_t socket, base_hart, hart_count, socket_imsic_base, imsic_addr;
+uint64_t socket_bits, hart_bits, guest_bits;
+
+aia_fd = kvm_create_device(kvm_state, KVM_DEV_TYPE_RISCV_AIA, false);
+
+if (aia_fd < 0) {
+error_report("Unable to create in-kernel irqchip");
+exit(1);
+}
+
+ret = kvm_device_access(aia_fd, KVM_DEV_RISCV_AIA_GRP_CONFIG,
+KVM_DEV_RISCV_AIA_CONFIG_MODE,
+&default_aia_mode, false, NULL);
+if (ret < 0) {
+error_report("KVM AIA: failed to get current KVM AIA mode");
+exit(1);
+}
+qemu_log("KVM AIA: default mode is %s\n",
+ kvm_aia_mode_str(default_aia_mode));
+
+if (default_aia_mode != aia_mode) {
+ret = kvm_device_access(aia_fd, KVM_DEV_RISCV_AIA_GRP_CONFIG,
+KVM_DEV_RISCV_AIA_CONFIG_MODE,
+&aia_mode, true, NULL);
+if (ret < 0)
+warn_report("KVM AIA: failed to set KVM AIA mode");
+else
+qemu_log("KVM AIA: set current mode to %s\n",
+ kvm_aia_mode_str(aia_mode));
+}
+
+ret = kvm_device_access(aia_fd, KVM_DEV_RISCV_AIA_GRP_CONFIG,
+KVM_DEV_RISCV_AIA_CONFIG_SRCS,
+&aia_irq_num, true, NULL);
+if (ret < 0) {
+error_report("KVM AIA: failed to set number of input irq lines");
+exit(1);
+}
+
+ret = kvm_device_access(aia_fd, KVM_DEV_RISCV_AIA_GRP_CONFIG,
+KVM_DEV_RISCV_AIA_CONFIG_IDS,
+&aia_msi_num, true, NULL);
+if (ret < 0) {
+error_report("KVM AIA: failed to set number of msi");
+exit(1);
+}
+
+socket_bits = find_last_bit(&socket_count, BITS_PER_LONG) + 1;
+ret = kvm_device_access(aia_fd, KVM_DEV_RISCV_AIA_GRP_CONFIG,
+KVM_DEV_RISCV_AIA_CONFIG_GROUP_BITS,
+&socket_bits, true, NULL);
+if (ret < 0) {
+error_report("KVM AIA: failed to set group_bits");
+exit(1);
+}
+
+ret = kvm_device_access(aia_fd, KVM_DEV_RISCV_AIA_GRP_CONFIG,
+KVM_DEV_RISCV_AIA_CONFIG_GROUP_SHIFT,
+&group_shift, true, NULL);
+if (ret < 0) {
+error_report("KVM AIA: failed to set group_shift");
+exit(1);
+}
+
+guest_bits = guest_num == 0 ? 0 :
+ find_last_bit(&guest_num, BITS_PER_LONG) + 1;
+ret = kvm_device_access(aia_fd, KVM_DEV_RISCV_AIA_GRP_CONFIG,
+KVM_DEV_RISCV_AIA_CONFIG_GUEST_BITS,
+&guest_bits, true, NULL);
+if (ret < 0) {
+error_report("KVM AIA: failed to set guest_bits");
+exit(1);
+}
+
+ret = kvm_device_access(aia_fd, KVM_DEV_RISCV_AIA_GRP_ADDR,
+KVM_DEV_RISCV_AIA_ADDR_APLIC,
+&aplic_base, true, NULL);
+if (ret < 0) {
+error_report("KVM AIA: failed to set the base address of APLIC");
+exit(

[PATCH v5 1/5] target/riscv: support the AIA device emulation with KVM enabled

2023-07-13 Thread Yong-Xuan Wang

In this patch, we create the APLIC and IMSIC FDT helper functions and
remove M mode AIA devices when using KVM acceleration.

Signed-off-by: Yong-Xuan Wang 
Reviewed-by: Jim Shu 
Reviewed-by: Daniel Henrique Barboza 
---
 hw/riscv/virt.c | 264 ++--
 1 file changed, 123 insertions(+), 141 deletions(-)

diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c
index d90286dc46..26b3aff28e 100644
--- a/hw/riscv/virt.c
+++ b/hw/riscv/virt.c
@@ -516,79 +516,28 @@ static uint32_t imsic_num_bits(uint32_t count)
 return ret;
 }
 
-static void create_fdt_imsic(RISCVVirtState *s, const MemMapEntry *memmap,
- uint32_t *phandle, uint32_t *intc_phandles,
- uint32_t *msi_m_phandle, uint32_t *msi_s_phandle)
+static void create_fdt_one_imsic(RISCVVirtState *s, hwaddr base_addr,
+ uint32_t *intc_phandles, uint32_t msi_phandle,
+ bool m_mode, uint32_t imsic_guest_bits)
 {
 int cpu, socket;
 char *imsic_name;
 MachineState *ms = MACHINE(s);
 int socket_count = riscv_socket_count(ms);
-uint32_t imsic_max_hart_per_socket, imsic_guest_bits;
+uint32_t imsic_max_hart_per_socket;
 uint32_t *imsic_cells, *imsic_regs, imsic_addr, imsic_size;
 
-*msi_m_phandle = (*phandle)++;
-*msi_s_phandle = (*phandle)++;
 imsic_cells = g_new0(uint32_t, ms->smp.cpus * 2);
 imsic_regs = g_new0(uint32_t, socket_count * 4);
 
-/* M-level IMSIC node */
 for (cpu = 0; cpu < ms->smp.cpus; cpu++) {
 imsic_cells[cpu * 2 + 0] = cpu_to_be32(intc_phandles[cpu]);
-imsic_cells[cpu * 2 + 1] = cpu_to_be32(IRQ_M_EXT);
-}
-imsic_max_hart_per_socket = 0;
-for (socket = 0; socket < socket_count; socket++) {
-imsic_addr = memmap[VIRT_IMSIC_M].base +
- socket * VIRT_IMSIC_GROUP_MAX_SIZE;
-imsic_size = IMSIC_HART_SIZE(0) * s->soc[socket].num_harts;
-imsic_regs[socket * 4 + 0] = 0;
-imsic_regs[socket * 4 + 1] = cpu_to_be32(imsic_addr);
-imsic_regs[socket * 4 + 2] = 0;
-imsic_regs[socket * 4 + 3] = cpu_to_be32(imsic_size);
-if (imsic_max_hart_per_socket < s->soc[socket].num_harts) {
-imsic_max_hart_per_socket = s->soc[socket].num_harts;
-}
-}
-imsic_name = g_strdup_printf("/soc/imsics@%lx",
-(unsigned long)memmap[VIRT_IMSIC_M].base);
-qemu_fdt_add_subnode(ms->fdt, imsic_name);
-qemu_fdt_setprop_string(ms->fdt, imsic_name, "compatible",
-"riscv,imsics");
-qemu_fdt_setprop_cell(ms->fdt, imsic_name, "#interrupt-cells",
-FDT_IMSIC_INT_CELLS);
-qemu_fdt_setprop(ms->fdt, imsic_name, "interrupt-controller",
-NULL, 0);
-qemu_fdt_setprop(ms->fdt, imsic_name, "msi-controller",
-NULL, 0);
-qemu_fdt_setprop(ms->fdt, imsic_name, "interrupts-extended",
-imsic_cells, ms->smp.cpus * sizeof(uint32_t) * 2);
-qemu_fdt_setprop(ms->fdt, imsic_name, "reg", imsic_regs,
-socket_count * sizeof(uint32_t) * 4);
-qemu_fdt_setprop_cell(ms->fdt, imsic_name, "riscv,num-ids",
-VIRT_IRQCHIP_NUM_MSIS);
-if (socket_count > 1) {
-qemu_fdt_setprop_cell(ms->fdt, imsic_name, "riscv,hart-index-bits",
-imsic_num_bits(imsic_max_hart_per_socket));
-qemu_fdt_setprop_cell(ms->fdt, imsic_name, "riscv,group-index-bits",
-imsic_num_bits(socket_count));
-qemu_fdt_setprop_cell(ms->fdt, imsic_name, "riscv,group-index-shift",
-IMSIC_MMIO_GROUP_MIN_SHIFT);
+imsic_cells[cpu * 2 + 1] = cpu_to_be32(m_mode ? IRQ_M_EXT : IRQ_S_EXT);
 }
-qemu_fdt_setprop_cell(ms->fdt, imsic_name, "phandle", *msi_m_phandle);
 
-g_free(imsic_name);
-
-/* S-level IMSIC node */
-for (cpu = 0; cpu < ms->smp.cpus; cpu++) {
-imsic_cells[cpu * 2 + 0] = cpu_to_be32(intc_phandles[cpu]);
-imsic_cells[cpu * 2 + 1] = cpu_to_be32(IRQ_S_EXT);
-}
-imsic_guest_bits = imsic_num_bits(s->aia_guests + 1);
 imsic_max_hart_per_socket = 0;
 for (socket = 0; socket < socket_count; socket++) {
-imsic_addr = memmap[VIRT_IMSIC_S].base +
- socket * VIRT_IMSIC_GROUP_MAX_SIZE;
+imsic_addr = base_addr + socket * VIRT_IMSIC_GROUP_MAX_SIZE;
 imsic_size = IMSIC_HART_SIZE(imsic_guest_bits) *
  s->soc[socket].num_harts;
 imsic_regs[socket * 4 + 0] = 0;
@@ -599,119 +548,149 @@ static void create_fdt_imsic(RISCVVirtState *s, const 
MemMapEntry *memmap,
 imsic_max_hart_per_socket = s->soc[socket].num_harts;
 }
 }
-imsic_name = g_strdup_printf("/soc/imsics@%lx",
-(unsigned long)memmap[VIRT_IMSIC_S].base);
+
+imsic_name = g_strdup_printf("/soc/imsics@%lx", (unsigned long)base_addr);
 qemu_fdt_add_subnode(ms->fdt, imsic_name);
-qemu_fdt_setprop_string(ms->fdt, imsic_name, "compatible",
-"riscv,imsics");
+qemu

Re: [PATCH v2] hw/mips: Improve the default USB settings in the loongson3-virt machine

2023-07-13 Thread Thomas Huth


On 21/06/2023 09.41, Thomas Huth wrote:

It's possible to compile QEMU without the USB devices (e.g. when using
"--without-default-devices" as option for the "configure" script).
To be still able to run the loongson3-virt machine in default mode with
such a QEMU binary, we have to check here for the availability of the
USB devices first before instantiating them.

Signed-off-by: Thomas Huth 
---
  v2: Use #ifdef instead of runtime check

  hw/mips/loongson3_virt.c | 3 +++
  1 file changed, 3 insertions(+)

diff --git a/hw/mips/loongson3_virt.c b/hw/mips/loongson3_virt.c
index 216812f660..3094413eea 100644
--- a/hw/mips/loongson3_virt.c
+++ b/hw/mips/loongson3_virt.c
@@ -51,6 +51,7 @@
  #include "sysemu/reset.h"
  #include "sysemu/runstate.h"
  #include "qemu/error-report.h"
+#include CONFIG_DEVICES
  
  #define PM_CNTL_MODE  0x10
  
@@ -447,11 +448,13 @@ static inline void loongson3_virt_devices_init(MachineState *machine,
  
  pci_vga_init(pci_bus);
  
+#ifdef CONFIG_USB_OHCI_PCI

  if (defaults_enabled()) {
  pci_create_simple(pci_bus, -1, "pci-ohci");
  usb_create_simple(usb_bus_find(-1), "usb-kbd");
  usb_create_simple(usb_bus_find(-1), "usb-tablet");
  }
+#endif
  
  for (i = 0; i < nb_nics; i++) {

  NICInfo *nd = &nd_table[i];


Ping?

 Thomas

Re: [PATCH v5 1/5] target/riscv: support the AIA device emulation with KVM enabled

2023-07-13 Thread Andrew Jones

On Thu, Jul 13, 2023 at 08:43:53AM +, Yong-Xuan Wang wrote:
> In this patch, we create the APLIC and IMSIC FDT helper functions and
> remove M mode AIA devices when using KVM acceleration.
> 
> Signed-off-by: Yong-Xuan Wang 
> Reviewed-by: Jim Shu 
> Reviewed-by: Daniel Henrique Barboza 
> ---
>  hw/riscv/virt.c | 264 ++--
>  1 file changed, 123 insertions(+), 141 deletions(-)
> 
> diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c
> index d90286dc46..26b3aff28e 100644
> --- a/hw/riscv/virt.c
> +++ b/hw/riscv/virt.c
> @@ -516,79 +516,28 @@ static uint32_t imsic_num_bits(uint32_t count)
>  return ret;
>  }
>  
> -static void create_fdt_imsic(RISCVVirtState *s, const MemMapEntry *memmap,
> - uint32_t *phandle, uint32_t *intc_phandles,
> - uint32_t *msi_m_phandle, uint32_t 
> *msi_s_phandle)
> +static void create_fdt_one_imsic(RISCVVirtState *s, hwaddr base_addr,
> + uint32_t *intc_phandles, uint32_t 
> msi_phandle,
> + bool m_mode, uint32_t imsic_guest_bits)
>  {
>  int cpu, socket;
>  char *imsic_name;
>  MachineState *ms = MACHINE(s);
>  int socket_count = riscv_socket_count(ms);
> -uint32_t imsic_max_hart_per_socket, imsic_guest_bits;
> +uint32_t imsic_max_hart_per_socket;
>  uint32_t *imsic_cells, *imsic_regs, imsic_addr, imsic_size;
>  
> -*msi_m_phandle = (*phandle)++;
> -*msi_s_phandle = (*phandle)++;
>  imsic_cells = g_new0(uint32_t, ms->smp.cpus * 2);
>  imsic_regs = g_new0(uint32_t, socket_count * 4);
>  
> -/* M-level IMSIC node */
>  for (cpu = 0; cpu < ms->smp.cpus; cpu++) {
>  imsic_cells[cpu * 2 + 0] = cpu_to_be32(intc_phandles[cpu]);
> -imsic_cells[cpu * 2 + 1] = cpu_to_be32(IRQ_M_EXT);
> -}
> -imsic_max_hart_per_socket = 0;
> -for (socket = 0; socket < socket_count; socket++) {
> -imsic_addr = memmap[VIRT_IMSIC_M].base +
> - socket * VIRT_IMSIC_GROUP_MAX_SIZE;
> -imsic_size = IMSIC_HART_SIZE(0) * s->soc[socket].num_harts;
> -imsic_regs[socket * 4 + 0] = 0;
> -imsic_regs[socket * 4 + 1] = cpu_to_be32(imsic_addr);
> -imsic_regs[socket * 4 + 2] = 0;
> -imsic_regs[socket * 4 + 3] = cpu_to_be32(imsic_size);
> -if (imsic_max_hart_per_socket < s->soc[socket].num_harts) {
> -imsic_max_hart_per_socket = s->soc[socket].num_harts;
> -}
> -}
> -imsic_name = g_strdup_printf("/soc/imsics@%lx",
> -(unsigned long)memmap[VIRT_IMSIC_M].base);
> -qemu_fdt_add_subnode(ms->fdt, imsic_name);
> -qemu_fdt_setprop_string(ms->fdt, imsic_name, "compatible",
> -"riscv,imsics");
> -qemu_fdt_setprop_cell(ms->fdt, imsic_name, "#interrupt-cells",
> -FDT_IMSIC_INT_CELLS);
> -qemu_fdt_setprop(ms->fdt, imsic_name, "interrupt-controller",
> -NULL, 0);
> -qemu_fdt_setprop(ms->fdt, imsic_name, "msi-controller",
> -NULL, 0);
> -qemu_fdt_setprop(ms->fdt, imsic_name, "interrupts-extended",
> -imsic_cells, ms->smp.cpus * sizeof(uint32_t) * 2);
> -qemu_fdt_setprop(ms->fdt, imsic_name, "reg", imsic_regs,
> -socket_count * sizeof(uint32_t) * 4);
> -qemu_fdt_setprop_cell(ms->fdt, imsic_name, "riscv,num-ids",
> -VIRT_IRQCHIP_NUM_MSIS);
> -if (socket_count > 1) {
> -qemu_fdt_setprop_cell(ms->fdt, imsic_name, "riscv,hart-index-bits",
> -imsic_num_bits(imsic_max_hart_per_socket));
> -qemu_fdt_setprop_cell(ms->fdt, imsic_name, "riscv,group-index-bits",
> -imsic_num_bits(socket_count));
> -qemu_fdt_setprop_cell(ms->fdt, imsic_name, "riscv,group-index-shift",
> -IMSIC_MMIO_GROUP_MIN_SHIFT);
> +imsic_cells[cpu * 2 + 1] = cpu_to_be32(m_mode ? IRQ_M_EXT : 
> IRQ_S_EXT);
>  }
> -qemu_fdt_setprop_cell(ms->fdt, imsic_name, "phandle", *msi_m_phandle);
>  
> -g_free(imsic_name);
> -
> -/* S-level IMSIC node */
> -for (cpu = 0; cpu < ms->smp.cpus; cpu++) {
> -imsic_cells[cpu * 2 + 0] = cpu_to_be32(intc_phandles[cpu]);
> -imsic_cells[cpu * 2 + 1] = cpu_to_be32(IRQ_S_EXT);
> -}
> -imsic_guest_bits = imsic_num_bits(s->aia_guests + 1);
>  imsic_max_hart_per_socket = 0;
>  for (socket = 0; socket < socket_count; socket++) {
> -imsic_addr = memmap[VIRT_IMSIC_S].base +
> - socket * VIRT_IMSIC_GROUP_MAX_SIZE;
> +imsic_addr = base_addr + socket * VIRT_IMSIC_GROUP_MAX_SIZE;
>  imsic_size = IMSIC_HART_SIZE(imsic_guest_bits) *
>   s->soc[socket].num_harts;
>  imsic_regs[socket * 4 + 0] = 0;
> @@ -599,119 +548,149 @@ static void create_fdt_imsic(RISCVVirtState *s, const 
> MemMapEntry *memmap,
>  imsic_max_hart_per_socket = s->soc[socket].num_harts;
>  }
>  }
> -imsic_name = g_strdup_printf("/soc/imsics@%l

Re: [PATCH v5 2/5] target/riscv: check the in-kernel irqchip support

2023-07-13 Thread Andrew Jones

On Thu, Jul 13, 2023 at 08:43:54AM +, Yong-Xuan Wang wrote:
> We check the in-kernel irqchip support when using KVM acceleration.
> 
> Signed-off-by: Yong-Xuan Wang 
> Reviewed-by: Jim Shu 
> Reviewed-by: Daniel Henrique Barboza 
> ---
>  target/riscv/kvm.c | 10 +-
>  1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/target/riscv/kvm.c b/target/riscv/kvm.c
> index 9d8a8982f9..005e054604 100644
> --- a/target/riscv/kvm.c
> +++ b/target/riscv/kvm.c
> @@ -914,7 +914,15 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
>  
>  int kvm_arch_irqchip_create(KVMState *s)
>  {
> -return 0;
> +if (kvm_kernel_irqchip_split()) {
> +error_report("-machine kernel_irqchip=split is not supported on 
> RISC-V.");
> +exit(1);
> +}
> +
> +/*
> + * We can create the VAIA using the newer device control API.
> + */
> +return kvm_check_extension(s, KVM_CAP_DEVICE_CTRL);
>  }
>  
>  int kvm_arch_process_async_events(CPUState *cs)
> -- 
> 2.17.1
>

Reviewed-by: Andrew Jones

Re: [PATCH v5 3/5] target/riscv: Create an KVM AIA irqchip

2023-07-13 Thread Andrew Jones

On Thu, Jul 13, 2023 at 08:43:55AM +, Yong-Xuan Wang wrote:
> We create a vAIA chip by using the KVM_DEV_TYPE_RISCV_AIA and then set up
> the chip with the KVM_DEV_RISCV_AIA_GRP_* APIs.
> 
> Signed-off-by: Yong-Xuan Wang 
> Reviewed-by: Jim Shu 
> Reviewed-by: Daniel Henrique Barboza 
> ---
>  target/riscv/kvm.c   | 160 +++
>  target/riscv/kvm_riscv.h |   6 ++
>  2 files changed, 166 insertions(+)
> 
> diff --git a/target/riscv/kvm.c b/target/riscv/kvm.c
> index 005e054604..64156c15ec 100644
> --- a/target/riscv/kvm.c
> +++ b/target/riscv/kvm.c
> @@ -36,6 +36,7 @@
>  #include "exec/address-spaces.h"
>  #include "hw/boards.h"
>  #include "hw/irq.h"
> +#include "hw/intc/riscv_imsic.h"
>  #include "qemu/log.h"
>  #include "hw/loader.h"
>  #include "kvm_riscv.h"
> @@ -43,6 +44,7 @@
>  #include "chardev/char-fe.h"
>  #include "migration/migration.h"
>  #include "sysemu/runstate.h"
> +#include "hw/riscv/numa.h"
>  
>  static uint64_t kvm_riscv_reg_id(CPURISCVState *env, uint64_t type,
>   uint64_t idx)
> @@ -1026,3 +1028,161 @@ bool kvm_arch_cpu_check_are_resettable(void)
>  void kvm_arch_accel_class_init(ObjectClass *oc)
>  {
>  }
> +
> +char *kvm_aia_mode_str(uint64_t aia_mode)
> +{
> +const char *val;
> +
> +switch (aia_mode) {
> +case KVM_DEV_RISCV_AIA_MODE_EMUL:
> +return "emul";
> +case KVM_DEV_RISCV_AIA_MODE_HWACCEL:
> +return "hwaccel";
> +case KVM_DEV_RISCV_AIA_MODE_AUTO:
> +default:
> +return "auto";
> +};
> +}
> +
> +void kvm_riscv_aia_create(MachineState *machine,
> +  uint64_t aia_mode, uint64_t group_shift,
> +  uint64_t aia_irq_num, uint64_t aia_msi_num,
> +  uint64_t aplic_base, uint64_t imsic_base,
> +  uint64_t guest_num)
> +{
> +int ret, i;
> +int aia_fd = -1;
> +uint64_t default_aia_mode;
> +uint64_t socket_count = riscv_socket_count(machine);
> +uint64_t max_hart_per_socket = 0;
> +uint64_t socket, base_hart, hart_count, socket_imsic_base, imsic_addr;
> +uint64_t socket_bits, hart_bits, guest_bits;
> +
> +aia_fd = kvm_create_device(kvm_state, KVM_DEV_TYPE_RISCV_AIA, false);
> +
> +if (aia_fd < 0) {
> +error_report("Unable to create in-kernel irqchip");
> +exit(1);
> +}
> +
> +ret = kvm_device_access(aia_fd, KVM_DEV_RISCV_AIA_GRP_CONFIG,
> +KVM_DEV_RISCV_AIA_CONFIG_MODE,
> +&default_aia_mode, false, NULL);
> +if (ret < 0) {
> +error_report("KVM AIA: failed to get current KVM AIA mode");
> +exit(1);
> +}
> +qemu_log("KVM AIA: default mode is %s\n",
> + kvm_aia_mode_str(default_aia_mode));
> +
> +if (default_aia_mode != aia_mode) {
> +ret = kvm_device_access(aia_fd, KVM_DEV_RISCV_AIA_GRP_CONFIG,
> +KVM_DEV_RISCV_AIA_CONFIG_MODE,
> +&aia_mode, true, NULL);
> +if (ret < 0)
> +warn_report("KVM AIA: failed to set KVM AIA mode");
> +else
> +qemu_log("KVM AIA: set current mode to %s\n",
> + kvm_aia_mode_str(aia_mode));
> +}
> +
> +ret = kvm_device_access(aia_fd, KVM_DEV_RISCV_AIA_GRP_CONFIG,
> +KVM_DEV_RISCV_AIA_CONFIG_SRCS,
> +&aia_irq_num, true, NULL);
> +if (ret < 0) {
> +error_report("KVM AIA: failed to set number of input irq lines");
> +exit(1);
> +}
> +
> +ret = kvm_device_access(aia_fd, KVM_DEV_RISCV_AIA_GRP_CONFIG,
> +KVM_DEV_RISCV_AIA_CONFIG_IDS,
> +&aia_msi_num, true, NULL);
> +if (ret < 0) {
> +error_report("KVM AIA: failed to set number of msi");
> +exit(1);
> +}
> +
> +socket_bits = find_last_bit(&socket_count, BITS_PER_LONG) + 1;
> +ret = kvm_device_access(aia_fd, KVM_DEV_RISCV_AIA_GRP_CONFIG,
> +KVM_DEV_RISCV_AIA_CONFIG_GROUP_BITS,
> +&socket_bits, true, NULL);
> +if (ret < 0) {
> +error_report("KVM AIA: failed to set group_bits");
> +exit(1);
> +}
> +
> +ret = kvm_device_access(aia_fd, KVM_DEV_RISCV_AIA_GRP_CONFIG,
> +KVM_DEV_RISCV_AIA_CONFIG_GROUP_SHIFT,
> +&group_shift, true, NULL);
> +if (ret < 0) {
> +error_report("KVM AIA: failed to set group_shift");
> +exit(1);
> +}
> +
> +guest_bits = guest_num == 0 ? 0 :
> + find_last_bit(&guest_num, BITS_PER_LONG) + 1;
> +ret = kvm_device_access(aia_fd, KVM_DEV_RISCV_AIA_GRP_CONFIG,
> +KVM_DEV_RISCV_AIA_CONFIG_GUEST_BITS,
> +&guest_bits, true, NULL);
> +if (ret < 0) {
> +error_report("KVM AIA: failed to set

Re: qemu-img convert gets 403 from s3 presigned urls

2023-07-13 Thread Philippe Mathieu-Daudé


Cc'ing the qemu-block@ list.

On 13/7/23 04:03, Ross Vandegrift wrote:

Hi folks,

I'm trying to qemu-img convert a source that's hoted on s3, via a
presigned url.  When qemu-img hits it, s3 returns a 403.  But curl works
fine with the same url.

I didn't see any debugging or verbose output options in the man page.
Is there a way to get more info on the requests as they're made?

Assuming that awscli is installed and setup with credentials, the issue
looks like:

$ disk_url=$(aws s3 presign s3://bucket/path/to/image.qcow2)
$ qemu-img convert -f qcow2 -O raw "$disk_url" image.raw
qemu-img: Could not open 
'https://bucket.s3.us-east-2.amazonaws.com/path/to/image.qcow2?X-Amz-Algorithm=AWS4-HMAC-SHA256...[redacted]':
 CURL: Error opening file: The requested URL returned error: 403
$ curl -o image.qcow2 "$disk_url"
   % Total% Received % Xferd  Average Speed   TimeTime Time  Current
  Dload  Upload   Total   SpentLeft  Speed
   4 3322M4  160M0 0  13.5M  0  0:04:04  0:00:11  0:03:53 15.2M

Thanks for any pointers,
Ross

PS - please CC me, as I'm not subscribed.  Thanks!

Re: [PATCH v5 5/5] target/riscv: select KVM AIA in riscv virt machine

2023-07-13 Thread Andrew Jones

On Thu, Jul 13, 2023 at 08:43:57AM +, Yong-Xuan Wang wrote:
> Select KVM AIA when the host kernel has in-kernel AIA chip support.
> Since KVM AIA only has one APLIC instance, we map the QEMU APLIC
> devices to KVM APLIC.
> We also extend virt machine to specify the KVM AIA mode. The "kvm-aia"
> parameter is passed along with machine name in QEMU command-line.
> 1) "kvm-aia=emul": IMSIC is emulated by hypervisor
> 2) "kvm-aia=hwaccel": use hardware guest IMSIC
> 3) "kvm-aia=auto": use the hardware guest IMSICs whenever available
>otherwise we fallback to software emulation.
> 
> Signed-off-by: Yong-Xuan Wang 
> Reviewed-by: Jim Shu 
> Reviewed-by: Daniel Henrique Barboza 
> ---
>  hw/riscv/virt.c | 125 ++--
>  include/hw/riscv/virt.h |   1 +
>  2 files changed, 97 insertions(+), 29 deletions(-)
> 
> diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c
> index 26b3aff28e..b74142d978 100644
> --- a/hw/riscv/virt.c
> +++ b/hw/riscv/virt.c
> @@ -35,6 +35,7 @@
>  #include "hw/riscv/virt.h"
>  #include "hw/riscv/boot.h"
>  #include "hw/riscv/numa.h"
> +#include "kvm_riscv.h"
>  #include "hw/intc/riscv_aclint.h"
>  #include "hw/intc/riscv_aplic.h"
>  #include "hw/intc/riscv_imsic.h"
> @@ -75,6 +76,12 @@
>  #error "Can't accomodate all IMSIC groups in address space"
>  #endif
>  
> +/* KVM AIA only supports APLIC MSI. APLIC Wired is always emulated by QEMU. 
> */
> +static bool virt_use_kvm_aia(RISCVVirtState *s)
> +{
> +return kvm_irqchip_in_kernel() && s->aia_type == 
> VIRT_AIA_TYPE_APLIC_IMSIC;
> +}
> +
>  static const MemMapEntry virt_memmap[] = {
>  [VIRT_DEBUG] ={0x0, 0x100 },
>  [VIRT_MROM] = { 0x1000,0xf000 },
> @@ -609,16 +616,16 @@ static void create_fdt_one_aplic(RISCVVirtState *s, int 
> socket,
>   uint32_t *intc_phandles,
>   uint32_t aplic_phandle,
>   uint32_t aplic_child_phandle,
> - bool m_mode)
> + bool m_mode, int num_harts)
>  {
>  int cpu;
>  char *aplic_name;
>  uint32_t *aplic_cells;
>  MachineState *ms = MACHINE(s);
>  
> -aplic_cells = g_new0(uint32_t, s->soc[socket].num_harts * 2);
> +aplic_cells = g_new0(uint32_t, num_harts * 2);
>  
> -for (cpu = 0; cpu < s->soc[socket].num_harts; cpu++) {
> +for (cpu = 0; cpu < num_harts; cpu++) {
>  aplic_cells[cpu * 2 + 0] = cpu_to_be32(intc_phandles[cpu]);
>  aplic_cells[cpu * 2 + 1] = cpu_to_be32(m_mode ? IRQ_M_EXT : 
> IRQ_S_EXT);
>  }
> @@ -632,7 +639,7 @@ static void create_fdt_one_aplic(RISCVVirtState *s, int 
> socket,
>  
>  if (s->aia_type == VIRT_AIA_TYPE_APLIC) {
>  qemu_fdt_setprop(ms->fdt, aplic_name, "interrupts-extended",
> -aplic_cells, s->soc[socket].num_harts * sizeof(uint32_t) * 2);
> +aplic_cells, num_harts * sizeof(uint32_t) * 2);
>  } else {
>  qemu_fdt_setprop_cell(ms->fdt, aplic_name, "msi-parent", 
> msi_phandle);
>  }
> @@ -662,7 +669,8 @@ static void create_fdt_socket_aplic(RISCVVirtState *s,
>  uint32_t msi_s_phandle,
>  uint32_t *phandle,
>  uint32_t *intc_phandles,
> -uint32_t *aplic_phandles)
> +uint32_t *aplic_phandles,
> +int num_harts)
>  {
>  char *aplic_name;
>  unsigned long aplic_addr;
> @@ -679,7 +687,7 @@ static void create_fdt_socket_aplic(RISCVVirtState *s,
>  create_fdt_one_aplic(s, socket, aplic_addr, 
> memmap[VIRT_APLIC_M].size,
>   msi_m_phandle, intc_phandles,
>   aplic_m_phandle, aplic_s_phandle,
> - true);
> + true, num_harts);
>  }
>  
>  /* S-level APLIC node */
> @@ -688,7 +696,7 @@ static void create_fdt_socket_aplic(RISCVVirtState *s,
>  create_fdt_one_aplic(s, socket, aplic_addr, memmap[VIRT_APLIC_S].size,
>   msi_s_phandle, intc_phandles,
>   aplic_s_phandle, 0,
> - false);
> + false, num_harts);
>  
>  aplic_name = g_strdup_printf("/soc/aplic@%lx", aplic_addr);
>  
> @@ -772,34 +780,48 @@ static void create_fdt_sockets(RISCVVirtState *s, const 
> MemMapEntry *memmap,
>  *msi_pcie_phandle = msi_s_phandle;
>  }
>  
> -phandle_pos = ms->smp.cpus;
> -for (socket = (socket_count - 1); socket >= 0; socket--) {
> -phandle_pos -= s->soc[socket].num_harts;
> +/* KVM AIA only has one APLIC instance */
> +if (virt_use_kvm_aia(s)) {
> +create_fdt_socket_aplic(s, memmap, 0,
> +msi_m_phandle, msi_s_phandle, phandle,
> +

Re: [PATCH v2] hw/mips: Improve the default USB settings in the loongson3-virt machine

2023-07-13 Thread Philippe Mathieu-Daudé


Hi Thomas,

On 21/6/23 09:41, Thomas Huth wrote:

It's possible to compile QEMU without the USB devices (e.g. when using
"--without-default-devices" as option for the "configure" script).
To be still able to run the loongson3-virt machine in default mode with
such a QEMU binary, we have to check here for the availability of the
USB devices first before instantiating them.

Signed-off-by: Thomas Huth 
---
  v2: Use #ifdef instead of runtime check

  hw/mips/loongson3_virt.c | 3 +++
  1 file changed, 3 insertions(+)

diff --git a/hw/mips/loongson3_virt.c b/hw/mips/loongson3_virt.c
index 216812f660..3094413eea 100644
--- a/hw/mips/loongson3_virt.c
+++ b/hw/mips/loongson3_virt.c
@@ -51,6 +51,7 @@
  #include "sysemu/reset.h"
  #include "sysemu/runstate.h"
  #include "qemu/error-report.h"
+#include CONFIG_DEVICES


I'm a but reluctant to include CONFIG_DEVICES.


+#ifdef CONFIG_USB_OHCI_PCI
  if (defaults_enabled()) {


What about:

  if (defaults_enabled() && object_class_by_name(TYPE_PCI_OHCI)) {


  pci_create_simple(pci_bus, -1, "pci-ohci");
  usb_create_simple(usb_bus_find(-1), "usb-kbd");
  usb_create_simple(usb_bus_find(-1), "usb-tablet");
  }
+#endif

[PATCH v8 1/9] migration: New QAPI type 'MigrateAddress'

2023-07-13 Thread Het Gala

This patch introduces well defined MigrateAddress struct
and its related child objects.

The existing argument of 'migrate' and 'migrate-incoming' QAPI
- 'uri' is of type string. The current implementation follows
double encoding scheme for fetching migration parameters like
'uri' and this is not an ideal design.

Motive for intoducing struct level design is to prevent double
encoding of QAPI arguments, as Qemu should be able to directly
use the QAPI arguments without any level of encoding.

Note: this commit only adds the type, and actual uses comes
in later commits.

Suggested-by: Aravind Retnakaran 
Signed-off-by: Het Gala 
Reviewed-by: Juan Quintela 
Reviewed-by: Daniel P. Berrangé 
Acked-by: Markus Armbruster 
---
 qapi/migration.json | 41 +
 1 file changed, 41 insertions(+)

diff --git a/qapi/migration.json b/qapi/migration.json
index 47dfef0278..b583642c2d 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -1417,6 +1417,47 @@
 ##
 { 'command': 'migrate-continue', 'data': {'state': 'MigrationStatus'} }
 
+##
+# @MigrationAddressType:
+#
+# The migration stream transport mechanisms.
+#
+# @socket: Migrate via socket.
+#
+# @exec: Direct the migration stream to another process.
+#
+# @rdma: Migrate via RDMA.
+#
+# Since 8.1
+##
+{ 'enum': 'MigrationAddressType',
+  'data': ['socket', 'exec', 'rdma'] }
+
+##
+# @MigrationExecCommand:
+#
+# @args: command (list head) and arguments to execute.
+#
+# Since 8.1
+##
+{ 'struct': 'MigrationExecCommand',
+  'data': {'args': [ 'str' ] } }
+
+##
+# @MigrationAddress:
+#
+# Migration endpoint configuration.
+#
+# Since 8.1
+##
+{ 'union': 'MigrationAddress',
+  'base': { 'transport' : 'MigrationAddressType'},
+  'discriminator': 'transport',
+  'data': {
+'socket': 'SocketAddress',
+'exec': 'MigrationExecCommand',
+'rdma': 'InetSocketAddress' } }
+
 ##
 # @migrate:
 #
-- 
2.22.3

[PATCH v8 5/9] migration: convert exec backend to accept MigrateAddress.

2023-07-13 Thread Het Gala

Exec transport backend for 'migrate'/'migrate-incoming' QAPIs accept
new wire protocol of MigrateAddress struct.

It is achived by parsing 'uri' string and storing migration parameters
required for exec connection into strList struct.

Suggested-by: Aravind Retnakaran 
Signed-off-by: Het Gala 
---
 migration/exec.c  | 70 ++-
 migration/exec.h  |  4 +--
 migration/migration.c | 10 +++
 3 files changed, 56 insertions(+), 28 deletions(-)

diff --git a/migration/exec.c b/migration/exec.c
index c4a3293246..8bc321c66b 100644
--- a/migration/exec.c
+++ b/migration/exec.c
@@ -39,21 +39,50 @@ const char *exec_get_cmd_path(void)
 }
 #endif
 
-void exec_start_outgoing_migration(MigrationState *s, const char *command,
+/* provides the length of strList */
+static int
+str_list_length(strList *list)
+{
+int len = 0;
+strList *elem;
+
+for (elem = list; elem != NULL; elem = elem->next) {
+len++;
+}
+
+return len;
+}
+
+static void
+init_exec_array(strList *command, char **argv, Error **errp)
+{
+int i = 0;
+strList *lst;
+
+for (lst = command; lst; lst = lst->next) {
+argv[i++] = lst->value;
+}
+
+argv[i] = NULL;
+return;
+}
+
+void exec_start_outgoing_migration(MigrationState *s, strList *command,
Error **errp)
 {
 QIOChannel *ioc;
 
-#ifdef WIN32
-const char *argv[] = { exec_get_cmd_path(), "/c", command, NULL };
-#else
-const char *argv[] = { "/bin/sh", "-c", command, NULL };
-#endif
+int length = str_list_length(command);
+g_auto(GStrv) argv = (char **) g_new0(const char *, length);
 
-trace_migration_exec_outgoing(command);
-ioc = QIO_CHANNEL(qio_channel_command_new_spawn(argv,
-O_RDWR,
-errp));
+init_exec_array(command, argv, errp);
+g_autofree char *new_command = g_strjoinv(" ", (char **)argv);
+
+trace_migration_exec_outgoing(new_command);
+ioc = QIO_CHANNEL(
+qio_channel_command_new_spawn((const char * const *) argv,
+  O_RDWR,
+  errp));
 if (!ioc) {
 return;
 }
@@ -72,20 +101,21 @@ static gboolean exec_accept_incoming_migration(QIOChannel 
*ioc,
 return G_SOURCE_REMOVE;
 }
 
-void exec_start_incoming_migration(const char *command, Error **errp)
+void exec_start_incoming_migration(strList *command, Error **errp)
 {
 QIOChannel *ioc;
 
-#ifdef WIN32
-const char *argv[] = { exec_get_cmd_path(), "/c", command, NULL };
-#else
-const char *argv[] = { "/bin/sh", "-c", command, NULL };
-#endif
+int length = str_list_length(command);
+g_auto(GStrv) argv = (char **) g_new0(const char *, length);
+
+init_exec_array(command, argv, errp);
+g_autofree char *new_command = g_strjoinv(" ", (char **)argv);
 
-trace_migration_exec_incoming(command);
-ioc = QIO_CHANNEL(qio_channel_command_new_spawn(argv,
-O_RDWR,
-errp));
+trace_migration_exec_incoming(new_command);
+ioc = QIO_CHANNEL(
+qio_channel_command_new_spawn((const char * const *) argv,
+  O_RDWR,
+  errp));
 if (!ioc) {
 return;
 }
diff --git a/migration/exec.h b/migration/exec.h
index 736cd71028..3107f205e3 100644
--- a/migration/exec.h
+++ b/migration/exec.h
@@ -23,8 +23,8 @@
 #ifdef WIN32
 const char *exec_get_cmd_path(void);
 #endif
-void exec_start_incoming_migration(const char *host_port, Error **errp);
+void exec_start_incoming_migration(strList *host_port, Error **errp);
 
-void exec_start_outgoing_migration(MigrationState *s, const char *host_port,
+void exec_start_outgoing_migration(MigrationState *s, strList *host_port,
Error **errp);
 #endif
diff --git a/migration/migration.c b/migration/migration.c
index 2f5bd5df6b..52e2ec3502 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -467,7 +467,6 @@ static bool migrate_uri_parse(const char *uri,
 
 static void qemu_start_incoming_migration(const char *uri, Error **errp)
 {
-const char *p = NULL;
 g_autoptr(MigrationAddress) channel = g_new0(MigrationAddress, 1);
 
 /* URI is not suitable for migration? */
@@ -493,8 +492,8 @@ static void qemu_start_incoming_migration(const char *uri, 
Error **errp)
 } else if (channel->transport == MIGRATION_ADDRESS_TYPE_RDMA) {
 rdma_start_incoming_migration(&channel->u.rdma, errp);
 #endif
-} else if (strstart(uri, "exec:", &p)) {
-exec_start_incoming_migration(p, errp);
+} else if (channel->transport == MIGRATION_ADDRESS_TYPE_EXEC) {
+exec_start_incoming_migration(channel->u.exec.args, errp);
 } else {
 error_setg(errp, "unknown migration pr

[PATCH v8 7/9] migration: modify migration_channels_and_uri_compatible() for new QAPI syntax

2023-07-13 Thread Het Gala

migration_channels_and_uri_compatible() check for transport mechanism
suitable for multifd migration gets executed when the caller calls old
uri syntax. It needs it to be run when using the modern MigrateChannel
QAPI syntax too.

After URI -> 'MigrateChannel' :
migration_channels_and_uri_compatible() ->
migration_channels_and_transport_compatible() passes object as argument
and check for valid transport mechanism.

Suggested-by: Aravind Retnakaran 
Signed-off-by: Het Gala 
---
 migration/migration.c | 25 ++---
 1 file changed, 14 insertions(+), 11 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index 65272ef739..62894952ca 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -104,17 +104,20 @@ static bool migration_needs_multiple_sockets(void)
 return migrate_multifd() || migrate_postcopy_preempt();
 }
 
-static bool uri_supports_multi_channels(const char *uri)
+static bool transport_supports_multi_channels(SocketAddress *saddr)
 {
-return strstart(uri, "tcp:", NULL) || strstart(uri, "unix:", NULL) ||
-   strstart(uri, "vsock:", NULL);
+return saddr->type == SOCKET_ADDRESS_TYPE_INET ||
+   saddr->type == SOCKET_ADDRESS_TYPE_UNIX ||
+   saddr->type == SOCKET_ADDRESS_TYPE_VSOCK;
 }
 
 static bool
-migration_channels_and_uri_compatible(const char *uri, Error **errp)
+migration_channels_and_transport_compatible(MigrationAddress *addr,
+Error **errp)
 {
 if (migration_needs_multiple_sockets() &&
-!uri_supports_multi_channels(uri)) {
+(addr->transport == MIGRATION_ADDRESS_TYPE_SOCKET) &&
+!transport_supports_multi_channels(&addr->u.socket)) {
 error_setg(errp, "Migration requires multi-channel URIs (e.g. tcp)");
 return false;
 }
@@ -481,12 +484,12 @@ static void qemu_start_incoming_migration(const char 
*uri, bool has_channels,
 return;
 }
 
-/* URI is not suitable for migration? */
-if (!migration_channels_and_uri_compatible(uri, errp)) {
+if (uri && !migrate_uri_parse(uri, &channel, errp)) {
 return;
 }
 
-if (uri && !migrate_uri_parse(uri, &channel, errp)) {
+/* transport mechanism not suitable for migration? */
+if (!migration_channels_and_transport_compatible(channel, errp)) {
 return;
 }
 
@@ -1717,12 +1720,12 @@ void qmp_migrate(const char *uri, bool has_channels,
 return;
 }
 
-/* URI is not suitable for migration? */
-if (!migration_channels_and_uri_compatible(uri, errp)) {
+if (!migrate_uri_parse(uri, &channel, errp)) {
 return;
 }
 
-if (!migrate_uri_parse(uri, &channel, errp)) {
+/* transport mechanism not suitable for migration? */
+if (!migration_channels_and_transport_compatible(channel, errp)) {
 return;
 }
 
-- 
2.22.3

[PATCH v8 2/9] migration: convert migration 'uri' into 'MigrateAddress'

2023-07-13 Thread Het Gala

This patch parses 'migrate' and 'migrate-incoming' QAPI's 'uri'
string containing migration connection related information
and stores them inside well defined 'MigrateAddress' struct.

Misc: limit line width in exec.c to 80 char recommended by Qemu.

Suggested-by: Aravind Retnakaran 
Signed-off-by: Het Gala 
---
 migration/exec.c  |  4 ++--
 migration/exec.h  |  4 
 migration/migration.c | 54 +++
 3 files changed, 60 insertions(+), 2 deletions(-)

diff --git a/migration/exec.c b/migration/exec.c
index 2bf882bbe1..c4a3293246 100644
--- a/migration/exec.c
+++ b/migration/exec.c
@@ -27,7 +27,6 @@
 #include "qemu/cutils.h"
 
 #ifdef WIN32
-const char *exec_get_cmd_path(void);
 const char *exec_get_cmd_path(void)
 {
 g_autofree char *detected_path = g_new(char, MAX_PATH);
@@ -40,7 +39,8 @@ const char *exec_get_cmd_path(void)
 }
 #endif
 
-void exec_start_outgoing_migration(MigrationState *s, const char *command, 
Error **errp)
+void exec_start_outgoing_migration(MigrationState *s, const char *command,
+   Error **errp)
 {
 QIOChannel *ioc;
 
diff --git a/migration/exec.h b/migration/exec.h
index b210ffde7a..736cd71028 100644
--- a/migration/exec.h
+++ b/migration/exec.h
@@ -19,6 +19,10 @@
 
 #ifndef QEMU_MIGRATION_EXEC_H
 #define QEMU_MIGRATION_EXEC_H
+
+#ifdef WIN32
+const char *exec_get_cmd_path(void);
+#endif
 void exec_start_incoming_migration(const char *host_port, Error **errp);
 
 void exec_start_outgoing_migration(MigrationState *s, const char *host_port,
diff --git a/migration/migration.c b/migration/migration.c
index 91bba630a8..af2ec50061 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -64,6 +64,7 @@
 #include "yank_functions.h"
 #include "sysemu/qtest.h"
 #include "options.h"
+#include "qemu/sockets.h"
 
 static NotifierList migration_state_notifiers =
 NOTIFIER_LIST_INITIALIZER(migration_state_notifiers);
@@ -421,15 +422,63 @@ void migrate_add_address(SocketAddress *address)
   QAPI_CLONE(SocketAddress, address));
 }
 
+static bool migrate_uri_parse(const char *uri,
+  MigrationAddress **channel,
+  Error **errp)
+{
+g_autoptr(MigrationAddress) addr = g_new0(MigrationAddress, 1);
+SocketAddress *saddr = &addr->u.socket;
+InetSocketAddress *isock = &addr->u.rdma;
+strList **tail = &addr->u.exec.args;
+
+if (strstart(uri, "exec:", NULL)) {
+addr->transport = MIGRATION_ADDRESS_TYPE_EXEC;
+#ifdef WIN32
+QAPI_LIST_APPEND(tail, g_strdup(exec_get_cmd_path()));
+#else
+QAPI_LIST_APPEND(tail, g_strdup("/bin/sh"));
+#endif
+QAPI_LIST_APPEND(tail, g_strdup("-c"));
+QAPI_LIST_APPEND(tail, g_strdup(uri + strlen("exec:")));
+} else if (strstart(uri, "rdma:", NULL)) {
+if (inet_parse(isock, uri + strlen("rdma:"), errp)) {
+qapi_free_InetSocketAddress(isock);
+return false;
+}
+addr->transport = MIGRATION_ADDRESS_TYPE_RDMA;
+} else if (strstart(uri, "tcp:", NULL) ||
+strstart(uri, "unix:", NULL) ||
+strstart(uri, "vsock:", NULL) ||
+strstart(uri, "fd:", NULL)) {
+addr->transport = MIGRATION_ADDRESS_TYPE_SOCKET;
+saddr = socket_parse(uri, errp);
+if (!saddr) {
+qapi_free_SocketAddress(saddr);
+return false;
+}
+} else {
+error_setg(errp, "unknown migration protocol: %s", uri);
+return false;
+}
+
+*channel = addr;
+return true;
+}
+
 static void qemu_start_incoming_migration(const char *uri, Error **errp)
 {
 const char *p = NULL;
+g_autoptr(MigrationAddress) channel = g_new0(MigrationAddress, 1);
 
 /* URI is not suitable for migration? */
 if (!migration_channels_and_uri_compatible(uri, errp)) {
 return;
 }
 
+if (uri && !migrate_uri_parse(uri, &channel, errp)) {
+return;
+}
+
 qapi_event_send_migration(MIGRATION_STATUS_SETUP);
 if (strstart(uri, "tcp:", &p) ||
 strstart(uri, "unix:", NULL) ||
@@ -1641,12 +1690,17 @@ void qmp_migrate(const char *uri, bool has_blk, bool 
blk,
 Error *local_err = NULL;
 MigrationState *s = migrate_get_current();
 const char *p = NULL;
+g_autoptr(MigrationAddress) channel = g_new0(MigrationAddress, 1);
 
 /* URI is not suitable for migration? */
 if (!migration_channels_and_uri_compatible(uri, errp)) {
 return;
 }
 
+if (!migrate_uri_parse(uri, &channel, &local_err)) {
+return;
+}
+
 resume_requested = has_resume && resume;
 if (!migrate_prepare(s, has_blk && blk, has_inc && inc,
  resume_requested, errp)) {
-- 
2.22.3

[PATCH v8 9/9] migration: Add test case for modified QAPI syntax

2023-07-13 Thread Het Gala

Add multifd tcp common test case for new QAPI syntax defined.

Suggested-by: Aravind Retnakaran 
Signed-off-by: Het Gala 
---
 tests/qtest/migration-test.c | 45 
 1 file changed, 45 insertions(+)

diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index efa8c729db..786e6bbe8b 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -2190,6 +2190,32 @@ test_migrate_precopy_tcp_multifd_start_common(QTestState 
*from,
 return NULL;
 }
 
+static void *
+test_migrate_precopy_tcp_multifd_start_new_syntax_common(QTestState *from,
+ QTestState *to,
+ const char *method)
+{
+migrate_set_parameter_int(from, "multifd-channels", 16);
+migrate_set_parameter_int(to, "multifd-channels", 16);
+
+migrate_set_parameter_str(from, "multifd-compression", method);
+migrate_set_parameter_str(to, "multifd-compression", method);
+
+migrate_set_capability(from, "multifd", true);
+migrate_set_capability(to, "multifd", true);
+
+/* Start incoming migration from the 1st socket */
+qtest_qmp_assert_success(to, "{ 'execute': 'migrate-incoming',"
+ "  'arguments': { "
+ "  'channels': [ { 'channeltype': 'main',"
+ "  'addr': { 'transport': 'socket',"
+ "'type': 'inet',"
+ "'host': '127.0.0.1',"
+ "'port': '0' } } ] } }");
+
+return NULL;
+}
+
 static void *
 test_migrate_precopy_tcp_multifd_start(QTestState *from,
QTestState *to)
@@ -2197,6 +2223,14 @@ test_migrate_precopy_tcp_multifd_start(QTestState *from,
 return test_migrate_precopy_tcp_multifd_start_common(from, to, "none");
 }
 
+static void *
+test_migrate_precopy_tcp_multifd_new_syntax_start(QTestState *from,
+  QTestState *to)
+{
+return test_migrate_precopy_tcp_multifd_start_new_syntax_common(from,
+  to, "none");
+}
+
 static void *
 test_migrate_precopy_tcp_multifd_zlib_start(QTestState *from,
 QTestState *to)
@@ -2228,6 +2262,15 @@ static void test_multifd_tcp_none(void)
 test_precopy_common(&args);
 }
 
+static void test_multifd_tcp_new_syntax_none(void)
+{
+MigrateCommon args = {
+.listen_uri = "defer",
+.start_hook = test_migrate_precopy_tcp_multifd_new_syntax_start,
+};
+test_precopy_common(&args);
+}
+
 static void test_multifd_tcp_zlib(void)
 {
 MigrateCommon args = {
@@ -2916,6 +2959,8 @@ int main(int argc, char **argv)
 }
 qtest_add_func("/migration/multifd/tcp/plain/none",
test_multifd_tcp_none);
+qtest_add_func("/migration/multifd/tcp/plain/none",
+   test_multifd_tcp_new_syntax_none);
 /*
  * This test is flaky and sometimes fails in CI and otherwise:
  * don't run unless user opts in via environment variable.
-- 
2.22.3

[PATCH v8 8/9] migration: Implement MigrateChannelList to migration flow.

2023-07-13 Thread Het Gala

Integrate MigrateChannelList with all transport backends
(socket, exec and rdma) for both src and dest migration
endpoints.

Suggested-by: Aravind Retnakaran 
Signed-off-by: Het Gala 
---
 migration/migration.c | 77 ---
 migration/socket.c|  5 ++-
 2 files changed, 54 insertions(+), 28 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index 62894952ca..43c7e4b525 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -426,9 +426,10 @@ void migrate_add_address(SocketAddress *address)
 }
 
 static bool migrate_uri_parse(const char *uri,
-  MigrationAddress **channel,
+  MigrationChannel **channel,
   Error **errp)
 {
+g_autoptr(MigrationChannel) val = g_new0(MigrationChannel, 1);
 g_autoptr(MigrationAddress) addr = g_new0(MigrationAddress, 1);
 SocketAddress *saddr = &addr->u.socket;
 InetSocketAddress *isock = &addr->u.rdma;
@@ -464,7 +465,9 @@ static bool migrate_uri_parse(const char *uri,
 return false;
 }
 
-*channel = addr;
+val->channel_type = MIGRATION_CHANNEL_TYPE_MAIN;
+val->addr = addr;
+*channel = val;
 return true;
 }
 
@@ -472,7 +475,8 @@ static void qemu_start_incoming_migration(const char *uri, 
bool has_channels,
   MigrationChannelList *channels,
   Error **errp)
 {
-g_autoptr(MigrationAddress) channel = g_new0(MigrationAddress, 1);
+g_autoptr(MigrationChannel) channel = g_new0(MigrationChannel, 1);
+g_autoptr(MigrationAddress) addr = g_new0(MigrationAddress, 1);
 
 /*
  * Having preliminary checks for uri and channel
@@ -482,20 +486,29 @@ static void qemu_start_incoming_migration(const char 
*uri, bool has_channels,
"exclusive; exactly one of the two should be present in "
"'migrate-incoming' qmp command ");
 return;
-}
-
-if (uri && !migrate_uri_parse(uri, &channel, errp)) {
-return;
+} else if (channels) {
+/* To verify that Migrate channel list has only item */
+if (channels->next) {
+error_setg(errp, "Channel list has more than one entries");
+return;
+}
+channel = channels->value;
+addr = channel->addr;
+} else {
+/* caller uses the old URI syntax */
+if (uri && !migrate_uri_parse(uri, &channel, errp)) {
+return;
+}
 }
 
 /* transport mechanism not suitable for migration? */
-if (!migration_channels_and_transport_compatible(channel, errp)) {
+if (!migration_channels_and_transport_compatible(addr, errp)) {
 return;
 }
 
 qapi_event_send_migration(MIGRATION_STATUS_SETUP);
-if (channel->transport == MIGRATION_ADDRESS_TYPE_SOCKET) {
-SocketAddress *saddr = &channel->u.socket;
+if (addr->transport == MIGRATION_ADDRESS_TYPE_SOCKET) {
+SocketAddress *saddr = &addr->u.socket;
 if (saddr->type == SOCKET_ADDRESS_TYPE_INET ||
 saddr->type == SOCKET_ADDRESS_TYPE_UNIX ||
 saddr->type == SOCKET_ADDRESS_TYPE_VSOCK) {
@@ -504,11 +517,11 @@ static void qemu_start_incoming_migration(const char 
*uri, bool has_channels,
 fd_start_incoming_migration(saddr->u.fd.str, errp);
 }
 #ifdef CONFIG_RDMA
-} else if (channel->transport == MIGRATION_ADDRESS_TYPE_RDMA) {
-rdma_start_incoming_migration(&channel->u.rdma, errp);
-#endif
-} else if (channel->transport == MIGRATION_ADDRESS_TYPE_EXEC) {
-exec_start_incoming_migration(channel->u.exec.args, errp);
+} else if (addr->transport == MIGRATION_ADDRESS_TYPE_RDMA) {
+rdma_start_incoming_migration(&addr->u.rdma, errp);
+ #endif
+} else if (addr->transport == MIGRATION_ADDRESS_TYPE_EXEC) {
+exec_start_incoming_migration(addr->u.exec.args, errp);
 } else {
 error_setg(errp, "unknown migration protocol: %s", uri);
 }
@@ -1708,7 +1721,8 @@ void qmp_migrate(const char *uri, bool has_channels,
 bool resume_requested;
 Error *local_err = NULL;
 MigrationState *s = migrate_get_current();
-g_autoptr(MigrationAddress) channel = g_new0(MigrationAddress, 1);
+g_autoptr(MigrationChannel) channel = g_new0(MigrationChannel, 1);
+g_autoptr(MigrationAddress) addr = g_new0(MigrationAddress, 1);
 
 /*
  * Having preliminary checks for uri and channel
@@ -1718,14 +1732,23 @@ void qmp_migrate(const char *uri, bool has_channels,
"exclusive; exactly one of the two should be present in "
"'migrate' qmp command ");
 return;
-}
-
-if (!migrate_uri_parse(uri, &channel, errp)) {
-return;
+} else if (channels) {
+/* To verify that Migrate channel list has only item */
+if (channels->next) {
+error_setg(errp, "Channel list has mo

[PATCH v8 4/9] migration: convert rdma backend to accept MigrateAddress

2023-07-13 Thread Het Gala

RDMA based transport backend for 'migrate'/'migrate-incoming' QAPIs
accept new wire protocol of MigrateAddress struct.

It is achived by parsing 'uri' string and storing migration parameters
required for RDMA connection into well defined InetSocketAddress struct.

Suggested-by: Aravind Retnakaran 
Signed-off-by: Het Gala 
---
 migration/migration.c |  8 
 migration/rdma.c  | 34 --
 migration/rdma.h  |  6 --
 3 files changed, 20 insertions(+), 28 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index 8108d4248f..2f5bd5df6b 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -490,8 +490,8 @@ static void qemu_start_incoming_migration(const char *uri, 
Error **errp)
 fd_start_incoming_migration(saddr->u.fd.str, errp);
 }
 #ifdef CONFIG_RDMA
-} else if (strstart(uri, "rdma:", &p)) {
-rdma_start_incoming_migration(p, errp);
+} else if (channel->transport == MIGRATION_ADDRESS_TYPE_RDMA) {
+rdma_start_incoming_migration(&channel->u.rdma, errp);
 #endif
 } else if (strstart(uri, "exec:", &p)) {
 exec_start_incoming_migration(p, errp);
@@ -1727,8 +1727,8 @@ void qmp_migrate(const char *uri, bool has_blk, bool blk,
 fd_start_outgoing_migration(s, saddr->u.fd.str, &local_err);
 }
 #ifdef CONFIG_RDMA
-} else if (strstart(uri, "rdma:", &p)) {
-rdma_start_outgoing_migration(s, p, &local_err);
+} else if (channel->transport == MIGRATION_ADDRESS_TYPE_RDMA) {
+rdma_start_outgoing_migration(s, &channel->u.rdma, &local_err);
 #endif
 } else if (strstart(uri, "exec:", &p)) {
 exec_start_outgoing_migration(s, p, &local_err);
diff --git a/migration/rdma.c b/migration/rdma.c
index dd1c039e6c..4d64fae492 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -319,7 +319,6 @@ typedef struct RDMALocalBlocks {
 typedef struct RDMAContext {
 char *host;
 int port;
-char *host_port;
 
 RDMAWorkRequestData wr_data[RDMA_WRID_MAX];
 
@@ -2455,9 +2454,7 @@ static void qemu_rdma_cleanup(RDMAContext *rdma)
 rdma->channel = NULL;
 }
 g_free(rdma->host);
-g_free(rdma->host_port);
 rdma->host = NULL;
-rdma->host_port = NULL;
 }
 
 
@@ -2739,28 +2736,17 @@ static void qemu_rdma_return_path_dest_init(RDMAContext 
*rdma_return_path,
 rdma_return_path->is_return_path = true;
 }
 
-static void *qemu_rdma_data_init(const char *host_port, Error **errp)
+static void *qemu_rdma_data_init(InetSocketAddress *saddr, Error **errp)
 {
 RDMAContext *rdma = NULL;
-InetSocketAddress *addr;
 
-if (host_port) {
+if (saddr) {
 rdma = g_new0(RDMAContext, 1);
 rdma->current_index = -1;
 rdma->current_chunk = -1;
 
-addr = g_new(InetSocketAddress, 1);
-if (!inet_parse(addr, host_port, NULL)) {
-rdma->port = atoi(addr->port);
-rdma->host = g_strdup(addr->host);
-rdma->host_port = g_strdup(host_port);
-} else {
-ERROR(errp, "bad RDMA migration address '%s'", host_port);
-g_free(rdma);
-rdma = NULL;
-}
-
-qapi_free_InetSocketAddress(addr);
+rdma->host = g_strdup(saddr->host);
+rdma->port = atoi(saddr->port);
 }
 
 return rdma;
@@ -3359,6 +3345,7 @@ static int qemu_rdma_accept(RDMAContext *rdma)
 .private_data_len = sizeof(cap),
  };
 RDMAContext *rdma_return_path = NULL;
+g_autoptr(InetSocketAddress) isock = g_new0(InetSocketAddress, 1);
 struct rdma_cm_event *cm_event;
 struct ibv_context *verbs;
 int ret = -EINVAL;
@@ -3374,13 +3361,16 @@ static int qemu_rdma_accept(RDMAContext *rdma)
 goto err_rdma_dest_wait;
 }
 
+isock->host = rdma->host;
+isock->port = g_strdup_printf("%d", rdma->port);
+
 /*
  * initialize the RDMAContext for return path for postcopy after first
  * connection request reached.
  */
 if ((migrate_postcopy() || migrate_return_path())
 && !rdma->is_return_path) {
-rdma_return_path = qemu_rdma_data_init(rdma->host_port, NULL);
+rdma_return_path = qemu_rdma_data_init(isock, NULL);
 if (rdma_return_path == NULL) {
 rdma_ack_cm_event(cm_event);
 goto err_rdma_dest_wait;
@@ -4113,7 +4103,8 @@ static void rdma_accept_incoming_migration(void *opaque)
 }
 }
 
-void rdma_start_incoming_migration(const char *host_port, Error **errp)
+void rdma_start_incoming_migration(InetSocketAddress *host_port,
+   Error **errp)
 {
 int ret;
 RDMAContext *rdma;
@@ -4159,13 +4150,12 @@ err:
 error_propagate(errp, local_err);
 if (rdma) {
 g_free(rdma->host);
-g_free(rdma->host_port);
 }
 g_free(rdma);
 }
 
 void rdma_start_outgoing_migration(void *opaque,
-

[PATCH v8 0/9] migration: Modify 'migrate' and 'migrate-incoming' QAPI commands for migration

2023-07-13 Thread Het Gala

This is v8 patchset of modified 'migrate' and 'migrate-incoming' QAPI design
for upstream review.

Would like to thank all the maintainers that actively participated in the v7
patchset discussion and gave insightful suggestions to improve the patches.


Link to previous upstream community patchset links:
v1: https://lists.gnu.org/archive/html/qemu-devel/2022-12/msg04339.html
v2: https://lists.gnu.org/archive/html/qemu-devel/2023-02/msg02106.html
v3: https://lists.gnu.org/archive/html/qemu-devel/2023-02/msg02473.html
v4: https://lists.gnu.org/archive/html/qemu-devel/2023-05/msg03064.html
v5: https://lists.gnu.org/archive/html/qemu-devel/2023-05/msg04845.html
v6: https://lists.gnu.org/archive/html/qemu-devel/2023-06/msg01251.html
v7: https://lists.gnu.org/archive/html/qemu-devel/2023-07/msg02027.html

v7 -> v8 changelog:
---
- Formatting, improvemnt around migration qapi documentation and commit
  message descriptions

Abstract:
-

Current QAPI 'migrate' command design (for initiating a migration
stream) contains information regarding different migrate transport mechanism
(tcp / unix / exec), dest-host IP address, and binding port number in form of
a string. Thus the design does seem to have some design issues. Some of the
issues, stated below are:

1. Use of string URIs is a data encoding scheme within a data encoding scheme.
   QEMU code should directly be able to work with the results from QAPI,
   without resorting to do a second level of parsing (eg. socket_parse()).
2. For features / parameters related to migration, the migration tunables needs
   to be defined and updated upfront. For example, 'migrate-set-capability'
   and 'migrate-set-parameter' is required to enable multifd capability and
   multifd-number of channels respectively. Instead, 'Multifd-channels' can
   directly be represented as a single additional parameter to 'migrate'
   QAPI. 'migrate-set-capability' and 'migrate-set-parameter' commands could
   be used for runtime tunables that need setting after migration has already
   started.

The current patchset focuses on solving the first problem of multi-level
encoding of URIs. The patch defines 'migrate' command as a QAPI discriminated
union for the various transport backends (like socket, exec and rdma), and on
basis of transport backends, different migration parameters are defined.

(uri) string -->  (channel) Channel-type
Transport-type
Migration parameters based on transport type
--

Het Gala (9):
  migration: New QAPI type 'MigrateAddress'
  migration: convert migration 'uri' into 'MigrateAddress'
  migration: convert socket backend to accept MigrateAddress
  migration: convert rdma backend to accept MigrateAddress
  migration: convert exec backend to accept MigrateAddress.
  migration: New migrate and migrate-incoming argument 'channels'
  migration: modify migration_channels_and_uri_compatible() for new QAPI
syntax
  migration: Implement MigrateChannelList to migration flow.
  migration: Add test case for modified QAPI syntax

 migration/exec.c   |  72 +
 migration/exec.h   |   8 +-
 migration/migration-hmp-cmds.c |  16 ++-
 migration/migration.c  | 182 ++---
 migration/rdma.c   |  34 +++---
 migration/rdma.h   |   6 +-
 migration/socket.c |  39 ++-
 migration/socket.h |   7 +-
 qapi/migration.json| 150 ++-
 softmmu/vl.c   |   2 +-
 tests/qtest/migration-test.c   |  45 
 11 files changed, 435 insertions(+), 126 deletions(-)

-- 
2.22.3

[PATCH v8 6/9] migration: New migrate and migrate-incoming argument 'channels'

2023-07-13 Thread Het Gala

MigrateChannelList allows to connect accross multiple interfaces.
Add MigrateChannelList struct as argument to migration QAPIs.

We plan to include multiple channels in future, to connnect
multiple interfaces. Hence, we choose 'MigrateChannelList'
as the new argument over 'MigrateChannel' to make migration
QAPIs future proof.

For current series, limit the size of MigrateChannelList
to single element (single interface) as runtime check.

Suggested-by: Aravind Retnakaran 
Signed-off-by: Het Gala 
Acked-by: Markus Armbruster 
---
 migration/migration-hmp-cmds.c |  16 +++--
 migration/migration.c  |  34 --
 qapi/migration.json| 109 -
 softmmu/vl.c   |   2 +-
 4 files changed, 147 insertions(+), 14 deletions(-)

diff --git a/migration/migration-hmp-cmds.c b/migration/migration-hmp-cmds.c
index 9885d7c9f7..5f04598202 100644
--- a/migration/migration-hmp-cmds.c
+++ b/migration/migration-hmp-cmds.c
@@ -423,10 +423,12 @@ void hmp_migrate_incoming(Monitor *mon, const QDict 
*qdict)
 {
 Error *err = NULL;
 const char *uri = qdict_get_str(qdict, "uri");
+MigrationChannelList *caps = NULL;
+g_autoptr(MigrationChannel) channel = g_new0(MigrationChannel, 1);
 
-qmp_migrate_incoming(uri, &err);
-
-hmp_handle_error(mon, err);
+QAPI_LIST_PREPEND(caps, channel);
+qmp_migrate_incoming(uri, false, caps, &err);
+qapi_free_MigrationChannelList(caps);
 }
 
 void hmp_migrate_recover(Monitor *mon, const QDict *qdict)
@@ -704,9 +706,13 @@ void hmp_migrate(Monitor *mon, const QDict *qdict)
 bool resume = qdict_get_try_bool(qdict, "resume", false);
 const char *uri = qdict_get_str(qdict, "uri");
 Error *err = NULL;
+MigrationChannelList *caps = NULL;
+g_autoptr(MigrationChannel) channel = g_new0(MigrationChannel, 1);
 
-qmp_migrate(uri, !!blk, blk, !!inc, inc,
-false, false, true, resume, &err);
+QAPI_LIST_PREPEND(caps, channel);
+qmp_migrate(uri, false, caps, !!blk, blk, !!inc, inc,
+ false, false, true, resume, &err);
+qapi_free_MigrationChannelList(caps);
 if (hmp_handle_error(mon, err)) {
 return;
 }
diff --git a/migration/migration.c b/migration/migration.c
index 52e2ec3502..65272ef739 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -465,10 +465,22 @@ static bool migrate_uri_parse(const char *uri,
 return true;
 }
 
-static void qemu_start_incoming_migration(const char *uri, Error **errp)
+static void qemu_start_incoming_migration(const char *uri, bool has_channels,
+  MigrationChannelList *channels,
+  Error **errp)
 {
 g_autoptr(MigrationAddress) channel = g_new0(MigrationAddress, 1);
 
+/*
+ * Having preliminary checks for uri and channel
+ */
+if (uri && has_channels) {
+error_setg(errp, "'uri' and 'channels' arguments are mutually "
+   "exclusive; exactly one of the two should be present in "
+   "'migrate-incoming' qmp command ");
+return;
+}
+
 /* URI is not suitable for migration? */
 if (!migration_channels_and_uri_compatible(uri, errp)) {
 return;
@@ -1496,7 +1508,8 @@ void migrate_del_blocker(Error *reason)
 migration_blockers = g_slist_remove(migration_blockers, reason);
 }
 
-void qmp_migrate_incoming(const char *uri, Error **errp)
+void qmp_migrate_incoming(const char *uri, bool has_channels,
+  MigrationChannelList *channels, Error **errp)
 {
 Error *local_err = NULL;
 static bool once = true;
@@ -1514,7 +1527,7 @@ void qmp_migrate_incoming(const char *uri, Error **errp)
 return;
 }
 
-qemu_start_incoming_migration(uri, &local_err);
+qemu_start_incoming_migration(uri, has_channels, channels, &local_err);
 
 if (local_err) {
 yank_unregister_instance(MIGRATION_YANK_INSTANCE);
@@ -1550,7 +1563,7 @@ void qmp_migrate_recover(const char *uri, Error **errp)
  * only re-setup the migration stream and poke existing migration
  * to continue using that newly established channel.
  */
-qemu_start_incoming_migration(uri, errp);
+qemu_start_incoming_migration(uri, false, NULL, errp);
 }
 
 void qmp_migrate_pause(Error **errp)
@@ -1684,7 +1697,8 @@ static bool migrate_prepare(MigrationState *s, bool blk, 
bool blk_inc,
 return true;
 }
 
-void qmp_migrate(const char *uri, bool has_blk, bool blk,
+void qmp_migrate(const char *uri, bool has_channels,
+ MigrationChannelList *channels, bool has_blk, bool blk,
  bool has_inc, bool inc, bool has_detach, bool detach,
  bool has_resume, bool resume, Error **errp)
 {
@@ -1693,6 +1707,16 @@ void qmp_migrate(const char *uri, bool has_blk, bool blk,
 MigrationState *s = migrate_get_current();
 g_autoptr(MigrationAddress) channel = g_new0(MigrationAddress, 1);
 
+

[PATCH v8 3/9] migration: convert socket backend to accept MigrateAddress

2023-07-13 Thread Het Gala

Socket transport backend for 'migrate'/'migrate-incoming' QAPIs accept
new wire protocol of MigrateAddress struct.

It is achived by parsing 'uri' string and storing migration parameters
required for socket connection into well defined SocketAddress struct.

Suggested-by: Aravind Retnakaran 
Signed-off-by: Het Gala 
---
 migration/migration.c | 32 +++-
 migration/socket.c| 34 +-
 migration/socket.h|  7 ---
 3 files changed, 28 insertions(+), 45 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index af2ec50061..8108d4248f 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -480,18 +480,21 @@ static void qemu_start_incoming_migration(const char 
*uri, Error **errp)
 }
 
 qapi_event_send_migration(MIGRATION_STATUS_SETUP);
-if (strstart(uri, "tcp:", &p) ||
-strstart(uri, "unix:", NULL) ||
-strstart(uri, "vsock:", NULL)) {
-socket_start_incoming_migration(p ? p : uri, errp);
+if (channel->transport == MIGRATION_ADDRESS_TYPE_SOCKET) {
+SocketAddress *saddr = &channel->u.socket;
+if (saddr->type == SOCKET_ADDRESS_TYPE_INET ||
+saddr->type == SOCKET_ADDRESS_TYPE_UNIX ||
+saddr->type == SOCKET_ADDRESS_TYPE_VSOCK) {
+socket_start_incoming_migration(saddr, errp);
+} else if (saddr->type == SOCKET_ADDRESS_TYPE_FD) {
+fd_start_incoming_migration(saddr->u.fd.str, errp);
+}
 #ifdef CONFIG_RDMA
 } else if (strstart(uri, "rdma:", &p)) {
 rdma_start_incoming_migration(p, errp);
 #endif
 } else if (strstart(uri, "exec:", &p)) {
 exec_start_incoming_migration(p, errp);
-} else if (strstart(uri, "fd:", &p)) {
-fd_start_incoming_migration(p, errp);
 } else {
 error_setg(errp, "unknown migration protocol: %s", uri);
 }
@@ -1697,7 +1700,7 @@ void qmp_migrate(const char *uri, bool has_blk, bool blk,
 return;
 }
 
-if (!migrate_uri_parse(uri, &channel, &local_err)) {
+if (!migrate_uri_parse(uri, &channel, errp)) {
 return;
 }
 
@@ -1714,18 +1717,21 @@ void qmp_migrate(const char *uri, bool has_blk, bool 
blk,
 }
 }
 
-if (strstart(uri, "tcp:", &p) ||
-strstart(uri, "unix:", NULL) ||
-strstart(uri, "vsock:", NULL)) {
-socket_start_outgoing_migration(s, p ? p : uri, &local_err);
+if (channel->transport == MIGRATION_ADDRESS_TYPE_SOCKET) {
+SocketAddress *saddr = &channel->u.socket;
+if (saddr->type == SOCKET_ADDRESS_TYPE_INET ||
+saddr->type == SOCKET_ADDRESS_TYPE_UNIX ||
+saddr->type == SOCKET_ADDRESS_TYPE_VSOCK) {
+socket_start_outgoing_migration(s, saddr, &local_err);
+} else if (saddr->type == SOCKET_ADDRESS_TYPE_FD) {
+fd_start_outgoing_migration(s, saddr->u.fd.str, &local_err);
+}
 #ifdef CONFIG_RDMA
 } else if (strstart(uri, "rdma:", &p)) {
 rdma_start_outgoing_migration(s, p, &local_err);
 #endif
 } else if (strstart(uri, "exec:", &p)) {
 exec_start_outgoing_migration(s, p, &local_err);
-} else if (strstart(uri, "fd:", &p)) {
-fd_start_outgoing_migration(s, p, &local_err);
 } else {
 if (!resume_requested) {
 yank_unregister_instance(MIGRATION_YANK_INSTANCE);
diff --git a/migration/socket.c b/migration/socket.c
index 1b6f5baefb..8e7430b266 100644
--- a/migration/socket.c
+++ b/migration/socket.c
@@ -108,10 +108,9 @@ out:
 object_unref(OBJECT(sioc));
 }
 
-static void
-socket_start_outgoing_migration_internal(MigrationState *s,
- SocketAddress *saddr,
- Error **errp)
+void socket_start_outgoing_migration(MigrationState *s,
+ SocketAddress *saddr,
+ Error **errp)
 {
 QIOChannelSocket *sioc = qio_channel_socket_new();
 struct SocketConnectData *data = g_new0(struct SocketConnectData, 1);
@@ -135,18 +134,6 @@ socket_start_outgoing_migration_internal(MigrationState *s,
  NULL);
 }
 
-void socket_start_outgoing_migration(MigrationState *s,
- const char *str,
- Error **errp)
-{
-Error *err = NULL;
-SocketAddress *saddr = socket_parse(str, &err);
-if (!err) {
-socket_start_outgoing_migration_internal(s, saddr, &err);
-}
-error_propagate(errp, err);
-}
-
 static void socket_accept_incoming_migration(QIONetListener *listener,
  QIOChannelSocket *cioc,
  gpointer opaque)
@@ -172,9 +159,8 @@ socket_incoming_migration_end(void *opaque)
 object_unref(OBJECT(listener));
 }
 
-static void
-socket_start_incoming_migration_internal(SocketAddress *saddr,
-

Re: [PATCH v2 0/3] Add VIRTIO sound card

2023-07-13 Thread Manos Pitsidianakis


Ping

Patch series on patchew: 
https://patchew.org/QEMU/cover.1686238728.git.manos.pitsidiana...@linaro.org/


Patch series on lore:
https://lore.kernel.org/qemu-devel/cover.1686238728.git.manos.pitsidiana...@linaro.org/


On Thu, 08 Jun 2023 18:56, Manos Pitsidianakis  
wrote:
This patch series adds an audio device implementing the recent virtio 
sound spec (1.2) and a corresponding PCI wrapper device.


Main differences with v1 patch 
<20230526204845.673031-1-manos.pitsidiana...@linaro.org>:


- Split virtio-snd and virtio-snd-pci devices to two commits
- Added audio capture support

Known problems (On pipewire at least):

- Stereo recording results in one channel with the recording + a ticking 
 kind of artifact and one channel that has no artifacts, just the 
 recording.


Manos Pitsidianakis (3):
 Add virtio-sound device
 Add virtio-sound-pci device
 Implement audio capture in virtio-snd device

MAINTAINERS|6 +
hw/virtio/Kconfig  |5 +
hw/virtio/meson.build  |2 +
hw/virtio/trace-events |   22 +
hw/virtio/virtio-snd-pci.c |  102 +++
hw/virtio/virtio-snd.c | 1290 
include/hw/pci/pci.h   |1 +
include/hw/virtio/virtio-snd.h |  194 +
softmmu/qdev-monitor.c |1 +
9 files changed, 1623 insertions(+)
create mode 100644 hw/virtio/virtio-snd-pci.c
create mode 100644 hw/virtio/virtio-snd.c
create mode 100644 include/hw/virtio/virtio-snd.h

Range-diff against v1:
1:  652c7d2d01 ! 1:  95d564fc1f Add virtio-sound and virtio-sound-pci devices
   @@ Metadata
Author: Manos Pitsidianakis 

 ## Commit message ##

   -Add virtio-sound and virtio-sound-pci devices
   +Add virtio-sound device

This patch adds an audio device implementing the recent virtio sound

   -spec (1.2) and a corresponding PCI wrapper device.
   +spec (1.2).

PCM functionality is implemented, and jack[0], chmaps[1] messages are

   -at the moment ignored.
   -
   -To test this, you'll need a >6.0 kernel compiled with the virtio-snd
   -flag enabled, which distros have off by default.
   -
   -Use with following flags in the invocation:
   -
   -  -device virtio-sound-pci,disable-legacy=on
   -
   -And an audio backend listed with `-audio driver=help` that works on
   -your host machine, e.g.:
   -
   -Pulseaudio:
   -  -audio driver=pa,model=virtio-sound,server=/run/user/1000/pulse/native
   -sdl:
   -  -audio driver=sdl,model=virtio-sound
   -coreaudio:
   -  -audio driver=coreaudio,model=virtio-sound
   -etc.
   -
   -You can use speaker-test from alsa-tools to play noise, sines, or
   -WAV files.
   +at the moment left unimplemented.

PS2: This patch was based on a draft patch posted by OpenSynergy in 2019. [2]

   @@ Commit message

Signed-off-by: Igor Skalkin 
Signed-off-by: Anton Yakovlev 

   + ## MAINTAINERS ##

   +@@ MAINTAINERS: F: hw/virtio/virtio-mem-pci.h
   + F: hw/virtio/virtio-mem-pci.c
   + F: include/hw/virtio/virtio-mem.h
   + 
   ++virtio-snd

   ++M: Manos Pitsidianakis 
   ++S: Supported
   ++F: hw/virtio/virtio-snd*.c
   ++F: include/hw/virtio/virtio-snd.h
   ++
   + nvme
   + M: Keith Busch 
   + M: Klaus Jensen 
   +
 ## hw/virtio/Kconfig ##
@@ hw/virtio/Kconfig: config VIRTIO_PCI
 depends on PCI
   @@ hw/virtio/Kconfig: config VIRTIO_PCI
 select VIRTIO

 ## hw/virtio/meson.build ##

   -@@ hw/virtio/meson.build: virtio_pci_ss.add(when: 'CONFIG_VIRTIO_SERIAL', 
if_true: files('virtio-serial-pc
   - virtio_pci_ss.add(when: 'CONFIG_VIRTIO_PMEM', if_true: 
files('virtio-pmem-pci.c'))
   - virtio_pci_ss.add(when: 'CONFIG_VIRTIO_IOMMU', if_true: 
files('virtio-iommu-pci.c'))
   - virtio_pci_ss.add(when: 'CONFIG_VIRTIO_MEM', if_true: 
files('virtio-mem-pci.c'))
   -+virtio_pci_ss.add(when: 'CONFIG_VIRTIO_SND', if_true: 
files('virtio-snd-pci.c'))
   - virtio_pci_ss.add(when: 'CONFIG_VHOST_VDPA_DEV', if_true: 
files('vdpa-dev-pci.c'))
   - 
   - specific_virtio_ss.add_all(when: 'CONFIG_VIRTIO_PCI', if_true: virtio_pci_ss)

@@ hw/virtio/meson.build: softmmu_ss.add(when: 'CONFIG_VIRTIO', if_false: 
files('virtio-stub.c'))
 softmmu_ss.add(when: 'CONFIG_ALL', if_true: files('vhost-stub.c'))
 softmmu_ss.add(when: 'CONFIG_ALL', if_true: files('virtio-stub.c'))
   @@ hw/virtio/trace-events: virtio_pmem_flush_done(int type) "fsync return=%d"
+virtio_snd_handle_pcm_info(int stream) "VIRTIO_SND_R_PCM_INFO called for stream 
%d"
+virtio_snd_handle_pcm_set_params(int stream) "VIRTIO_SND_PCM_SET_PARAMS called 
for stream %d"
+virtio_snd_handle_pcm_start_stop(const char *code, int stream) "%s called for 
stream %d"
   -+virtio_snd_handle_pcm_release(int stream) "VIRTIO_SND_PCM_RELEASE called for 
stream %id"
   ++virtio_snd_handle_pcm_release(int stream) "VIRTIO_SND_PCM_RELEASE called for 
stream %d"
+v

Re: [PATCH 2/2] migration: Make it clear that qemu_file_set_error() needs a negative value

2023-07-13 Thread Peter Maydell

On Thu, 6 Jul 2023 at 20:52, Fabiano Rosas  wrote:
>
> The convention in qemu-file.c is to return a negative value on
> error.
>
> The only place that could use qemu_file_set_error() to store a
> positive value to f->last_error was vmstate_save() which has been
> fixed in the previous patch.
>
> bdrv_inactivate_all() already returns a negative value on error.
>
> Document that qemu_file_set_error() needs -errno and alter the callers
> to check ret < 0.
>
> Signed-off-by: Fabiano Rosas 
> ---
>  migration/qemu-file.c | 2 ++
>  migration/savevm.c| 6 +++---
>  2 files changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/migration/qemu-file.c b/migration/qemu-file.c
> index acc282654a..8276bac248 100644
> --- a/migration/qemu-file.c
> +++ b/migration/qemu-file.c
> @@ -222,6 +222,8 @@ int qemu_file_get_error(QEMUFile *f)
>
>  /*
>   * Set the last error for stream f
> + *
> + * The error ('ret') should be in -errno format.
>   */
>  void qemu_file_set_error(QEMUFile *f, int ret)
>  {
> diff --git a/migration/savevm.c b/migration/savevm.c
> index 95c2abf47c..f3c303ab74 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -1249,7 +1249,7 @@ void qemu_savevm_state_setup(QEMUFile *f)
>  QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
>  if (se->vmsd && se->vmsd->early_setup) {
>  ret = vmstate_save(f, se, ms->vmdesc);
> -if (ret) {
> +if (ret < 0) {
>  qemu_file_set_error(f, ret);

You say qemu_file_set_error() should take an errno,
but vmstate_save() doesn't return one. It will directly
return whatever the VMStateInfo put, pre_save, etc hooks
return, which isn't necessarily an errno. (Specifically,
patch 1 in this series makes a .put hook return -1,
rather than an errno. I'm guessing other implementations
might too, though it's a bit hard to find them. A
coccinelle script could probably locate them.)

The de-facto API for all these methods in the VMStateInfo
is "0 on success, non-0 on failure", because
 (a) we don't document what the actual API for these is
 (b) the code which calls the methods and tests their
 return result is just doing "if (ret)" tests

If we want to tighten this up to "return an errno"
I think we would need to
 (a) audit all the implementations
 (b) document the updated API in vmstate.h and perhaps
 also docs/devel/migration.rst

thanks
-- PMM

Re: [PATCH] util/interval-tree: Avoid race conditions without optimization

2023-07-13 Thread Peter Maydell

On Fri, 7 Jul 2023 at 11:30, Richard Henderson
 wrote:
>
> Read the left and right trees once, so that the gating
> tests are meaningful.  This was only a problem at -O0,
> where the compiler didn't CSE the two reads.
>
> Cc: qemu-sta...@nongnu.org
> Signed-off-by: Richard Henderson 

Reviewed-by: Peter Maydell 

If this data structure is intended to support operations
being done on it while it's being mutated, shouldn't it
be using the atomic accessors, though? That would make
it clearer that you can't just undo the transformation
made by this patch.

thanks
-- PMM

Re: [PATCH 0/1] hw/arm/sbsa-ref: set 'slots' property of xhci

2023-07-13 Thread Peter Maydell

On Mon, 10 Jul 2023 at 07:38, Yuquan Wang  wrote:
>
> As the default xhci_sysbus just supports only one usb slot, it can not
> meet the working requirement of this bord. Therefore, we extend the
> slots of xhci to 64.
>
> Yuquan Wang (1):
>   hw/arm/sbsa-ref: set 'slots' property of xhci
>
>  hw/arm/sbsa-ref.c | 1 +
>  1 file changed, 1 insertion(+)



Applied to target-arm.next, thanks.

-- PMM

Re: [PATCH 1/3] scsi: fetch unit attention when creating the request

2023-07-13 Thread Stefano Garzarella


On Wed, Jul 12, 2023 at 06:38:14PM +0200, Paolo Bonzini wrote:

On 7/12/23 15:43, Stefano Garzarella wrote:

Commit 1880ad4f4e ("virtio-scsi: Batched prepare for cmd reqs") split
calls to scsi_req_new() and scsi_req_enqueue() in the virtio-scsi device.
This had no drawback, until commit 8cc5583abe ("virtio-scsi: Send


More precisely, it was pretty hard to trigger it; it might be possible 
using a CD-ROM, as it can report a MEDIUM_CHANGED unit attention.  I 
will change "This had no drawback" to "No ill effect was reported"


Yep, agree!




"REPORTED LUNS CHANGED" sense data upon disk hotplug events") added a
bus unit attention.


... that was fairly easy to trigger via SCSI device hotplug/hot-unplug.

Queued the series, thanks for the tests and for applying the cleanups 
on top.


Thanks for queueing!




Co-developed-by: Paolo Bonzini 


Heh, I basically only wrote the "if (req->init_req)" statement so 
that's pretty generous, but I'll keep it anyway. :)


Your help was invaluable ;-)

Thanks,
Stefano

Re: [PATCH 0/3] hw/arm/virt: Use generic CPU invalidation

2023-07-13 Thread Peter Maydell

On Thu, 13 Jul 2023 at 06:45, Gavin Shan  wrote:
>
> There is a generic CPU type invalidation in machine_run_board_init()
> and we needn't a same and private invalidation for hw/arm/virt machines.
> This series intends to use the generic CPU type invalidation on the
> hw/arm/virt machines.
>
> PATCH[1] factors the CPU type invalidation logic in machine_run_board_init()
>  to a helper validate_cpu_type().
> PATCH[2] uses the generic CPU type invalidation for hw/arm/virt machines
> PATCH[3] support "host-arm-cpu" CPU type only when KVM or HVF is visible
>
> Testing
> ===
>
> With the following command lines, the output messages are varied before
> and after the series is applied.
>
>   /home/gshan/sandbox/src/qemu/main/build/qemu-system-aarch64 \
>   -accel tcg -machine virt,gic-version=3,nvdimm=on\
>   -cpu cortex-a8 -smp maxcpus=2,cpus=1\
> :
>
> Before the series is applied:
>
>   qemu-system-aarch64: mach-virt: CPU type cortex-a8-arm-cpu not supported
>
> After the series is applied:
>
>   qemu-system-aarch64: Invalid CPU type: cortex-a8-arm-cpu
>   The valid types are: cortex-a7-arm-cpu, cortex-a15-arm-cpu, \
>   cortex-a35-arm-cpu, cortex-a55-arm-cpu, cortex-a72-arm-cpu, \
>   cortex-a76-arm-cpu, a64fx-arm-cpu, neoverse-n1-arm-cpu, \
>   neoverse-v1-arm-cpu, cortex-a53-arm-cpu, cortex-a57-arm-cpu, \
>   max-arm-cpu

I see this isn't a change in this patch, but given that
what the user specifies is not "cortex-a8-arm-cpu" but
"cortex-a8", why do we include the "-arm-cpu" suffix in
the error messages? It's not valid syntax to say
"-cpu cortex-a8-arm-cpu", so it's a bit misleading...

-- PMM

Re: [PATCH v2] hw/mips: Improve the default USB settings in the loongson3-virt machine

2023-07-13 Thread Michael Tokarev


13.07.2023 13:09, Philippe Mathieu-Daudé wrote:

Hi Thomas,

On 21/6/23 09:41, Thomas Huth wrote:

It's possible to compile QEMU without the USB devices (e.g. when using
"--without-default-devices" as option for the "configure" script).
To be still able to run the loongson3-virt machine in default mode with
such a QEMU binary, we have to check here for the availability of the
USB devices first before instantiating them.

Signed-off-by: Thomas Huth 
---
  v2: Use #ifdef instead of runtime check

  hw/mips/loongson3_virt.c | 3 +++
  1 file changed, 3 insertions(+)

diff --git a/hw/mips/loongson3_virt.c b/hw/mips/loongson3_virt.c
index 216812f660..3094413eea 100644
--- a/hw/mips/loongson3_virt.c
+++ b/hw/mips/loongson3_virt.c
@@ -51,6 +51,7 @@
  #include "sysemu/reset.h"
  #include "sysemu/runstate.h"
  #include "qemu/error-report.h"
+#include CONFIG_DEVICES


I'm a but reluctant to include CONFIG_DEVICES.


+#ifdef CONFIG_USB_OHCI_PCI
  if (defaults_enabled()) {


What about:

   if (defaults_enabled() && object_class_by_name(TYPE_PCI_OHCI)) {


I think it was a v1 like this :)

/mjt

Re: [PATCH v2] hw/mips: Improve the default USB settings in the loongson3-virt machine

2023-07-13 Thread Thomas Huth


On 13/07/2023 13.47, Michael Tokarev wrote:

13.07.2023 13:09, Philippe Mathieu-Daudé wrote:

Hi Thomas,

On 21/6/23 09:41, Thomas Huth wrote:

It's possible to compile QEMU without the USB devices (e.g. when using
"--without-default-devices" as option for the "configure" script).
To be still able to run the loongson3-virt machine in default mode with
such a QEMU binary, we have to check here for the availability of the
USB devices first before instantiating them.

Signed-off-by: Thomas Huth 
---
  v2: Use #ifdef instead of runtime check

  hw/mips/loongson3_virt.c | 3 +++
  1 file changed, 3 insertions(+)

diff --git a/hw/mips/loongson3_virt.c b/hw/mips/loongson3_virt.c
index 216812f660..3094413eea 100644
--- a/hw/mips/loongson3_virt.c
+++ b/hw/mips/loongson3_virt.c
@@ -51,6 +51,7 @@
  #include "sysemu/reset.h"
  #include "sysemu/runstate.h"
  #include "qemu/error-report.h"
+#include CONFIG_DEVICES


I'm a but reluctant to include CONFIG_DEVICES.


+#ifdef CONFIG_USB_OHCI_PCI
  if (defaults_enabled()) {


What about:

   if (defaults_enabled() && object_class_by_name(TYPE_PCI_OHCI)) {


I think it was a v1 like this :)


Yes, that's how I did it in v1:

 https://lore.kernel.org/qemu-devel/20230525064731.1854107-1-th...@redhat.com/

 Thomas

Re: [PATCH 0/3] hw/arm/virt: Use generic CPU invalidation

2023-07-13 Thread Marcin Juszkiewicz


W dniu 13.07.2023 o 13:44, Peter Maydell pisze:


I see this isn't a change in this patch, but given that
what the user specifies is not "cortex-a8-arm-cpu" but
"cortex-a8", why do we include the "-arm-cpu" suffix in
the error messages? It's not valid syntax to say
"-cpu cortex-a8-arm-cpu", so it's a bit misleading...


Internally those cpu names are "max-{TYPE_ARM_CPU}" and similar for 
other architectures.


I like the change but it (IMHO) needs to cut "-{TYPE_*_CPU}" string from 
names:


13:37 marcin@applejack:qemu$ ./build/aarch64-softmmu/qemu-system-aarch64 
-M virt -cpu cortex-r5

qemu-system-aarch64: Invalid CPU type: cortex-r5-arm-cpu
The valid types are: cortex-a7-arm-cpu, cortex-a15-arm-cpu, 
cortex-a35-arm-cpu, cortex-a55-arm-cpu, cortex-a72-arm-cpu, 
cortex-a76-arm-cpu, a64fx-arm-cpu, neoverse-n1-arm-cpu, 
neoverse-v1-arm-cpu, cortex-a53-arm-cpu, cortex-a57-arm-cpu, 
host-arm-cpu, max-arm-cpu


13:37 marcin@applejack:qemu$ ./build/aarch64-softmmu/qemu-system-aarch64 
-M virt -cpu cortex-a57-arm-cpu

qemu-system-aarch64: unable to find CPU model 'cortex-a57-arm-cpu'


I can change sbsa-ref to follow that change but let it be userfriendly.

Re: [PATCH 0/3] hw/arm/virt: Use generic CPU invalidation

2023-07-13 Thread Peter Maydell

On Thu, 13 Jul 2023 at 12:52, Marcin Juszkiewicz
 wrote:
>
> W dniu 13.07.2023 o 13:44, Peter Maydell pisze:
>
> > I see this isn't a change in this patch, but given that
> > what the user specifies is not "cortex-a8-arm-cpu" but
> > "cortex-a8", why do we include the "-arm-cpu" suffix in
> > the error messages? It's not valid syntax to say
> > "-cpu cortex-a8-arm-cpu", so it's a bit misleading...
>
> Internally those cpu names are "max-{TYPE_ARM_CPU}" and similar for
> other architectures.

Yes; my question is "why are we not using the user-facing
string rather than the internal type name?".

-- PMM

[PATCH] accel/tcg: Zero-pad vaddr in tlb_debug output

2023-07-13 Thread Anton Johansson via

In replacing target_ulong with vaddr and TARGET_FMT_lx with VADDR_PRIx,
the zero-padding of TARGET_FMT_lx got lost.  Readd 16-wide zero-padding
for logging consistency.

Suggested-by: Peter Maydell 
Signed-off-by: Anton Johansson 
---
 accel/tcg/cputlb.c | 20 ++--
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index 5b51eff5a4..12a4b2a187 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -497,8 +497,8 @@ static void tlb_flush_page_locked(CPUArchState *env, int 
midx, vaddr page)
 
 /* Check if we need to flush due to large pages.  */
 if ((page & lp_mask) == lp_addr) {
-tlb_debug("forcing full flush midx %d (%"
-  VADDR_PRIx "/%" VADDR_PRIx ")\n",
+tlb_debug("forcing full flush midx %d (%016"
+  VADDR_PRIx "/%016" VADDR_PRIx ")\n",
   midx, lp_addr, lp_mask);
 tlb_flush_one_mmuidx_locked(env, midx, get_clock_realtime());
 } else {
@@ -527,7 +527,7 @@ static void tlb_flush_page_by_mmuidx_async_0(CPUState *cpu,
 
 assert_cpu_is_self(cpu);
 
-tlb_debug("page addr: %" VADDR_PRIx " mmu_map:0x%x\n", addr, idxmap);
+tlb_debug("page addr: %016" VADDR_PRIx " mmu_map:0x%x\n", addr, idxmap);
 
 qemu_spin_lock(&env_tlb(env)->c.lock);
 for (mmu_idx = 0; mmu_idx < NB_MMU_MODES; mmu_idx++) {
@@ -591,7 +591,7 @@ static void tlb_flush_page_by_mmuidx_async_2(CPUState *cpu,
 
 void tlb_flush_page_by_mmuidx(CPUState *cpu, vaddr addr, uint16_t idxmap)
 {
-tlb_debug("addr: %" VADDR_PRIx " mmu_idx:%" PRIx16 "\n", addr, idxmap);
+tlb_debug("addr: %016" VADDR_PRIx " mmu_idx:%" PRIx16 "\n", addr, idxmap);
 
 /* This should already be page aligned */
 addr &= TARGET_PAGE_MASK;
@@ -625,7 +625,7 @@ void tlb_flush_page(CPUState *cpu, vaddr addr)
 void tlb_flush_page_by_mmuidx_all_cpus(CPUState *src_cpu, vaddr addr,
uint16_t idxmap)
 {
-tlb_debug("addr: %" VADDR_PRIx " mmu_idx:%"PRIx16"\n", addr, idxmap);
+tlb_debug("addr: %016" VADDR_PRIx " mmu_idx:%"PRIx16"\n", addr, idxmap);
 
 /* This should already be page aligned */
 addr &= TARGET_PAGE_MASK;
@@ -666,7 +666,7 @@ void tlb_flush_page_by_mmuidx_all_cpus_synced(CPUState 
*src_cpu,
   vaddr addr,
   uint16_t idxmap)
 {
-tlb_debug("addr: %" VADDR_PRIx " mmu_idx:%"PRIx16"\n", addr, idxmap);
+tlb_debug("addr: %016" VADDR_PRIx " mmu_idx:%"PRIx16"\n", addr, idxmap);
 
 /* This should already be page aligned */
 addr &= TARGET_PAGE_MASK;
@@ -728,7 +728,7 @@ static void tlb_flush_range_locked(CPUArchState *env, int 
midx,
  */
 if (mask < f->mask || len > f->mask) {
 tlb_debug("forcing full flush midx %d ("
-  "%" VADDR_PRIx "/%" VADDR_PRIx "+%" VADDR_PRIx ")\n",
+  "%016" VADDR_PRIx "/%016" VADDR_PRIx "+%016" VADDR_PRIx 
")\n",
   midx, addr, mask, len);
 tlb_flush_one_mmuidx_locked(env, midx, get_clock_realtime());
 return;
@@ -741,7 +741,7 @@ static void tlb_flush_range_locked(CPUArchState *env, int 
midx,
  */
 if (((addr + len - 1) & d->large_page_mask) == d->large_page_addr) {
 tlb_debug("forcing full flush midx %d ("
-  "%" VADDR_PRIx "/%" VADDR_PRIx ")\n",
+  "%016" VADDR_PRIx "/%016" VADDR_PRIx ")\n",
   midx, d->large_page_addr, d->large_page_mask);
 tlb_flush_one_mmuidx_locked(env, midx, get_clock_realtime());
 return;
@@ -773,7 +773,7 @@ static void tlb_flush_range_by_mmuidx_async_0(CPUState *cpu,
 
 assert_cpu_is_self(cpu);
 
-tlb_debug("range: %" VADDR_PRIx "/%u+%" VADDR_PRIx " mmu_map:0x%x\n",
+tlb_debug("range: %016" VADDR_PRIx "/%u+%016" VADDR_PRIx " mmu_map:0x%x\n",
   d.addr, d.bits, d.len, d.idxmap);
 
 qemu_spin_lock(&env_tlb(env)->c.lock);
@@ -1165,7 +1165,7 @@ void tlb_set_page_full(CPUState *cpu, int mmu_idx,
 &xlat, &sz, full->attrs, 
&prot);
 assert(sz >= TARGET_PAGE_SIZE);
 
-tlb_debug("vaddr=%" VADDR_PRIx " paddr=0x" HWADDR_FMT_plx
+tlb_debug("vaddr=%016" VADDR_PRIx " paddr=0x" HWADDR_FMT_plx
   " prot=%x idx=%d\n",
   addr, full->phys_addr, prot, mmu_idx);
 
-- 
2.41.0

[PATCH] target/hexagon/idef-parser: Remove self-assignment

2023-07-13 Thread Anton Johansson via

The self assignment is clearly useless, and @1.last_column does not have
to be set for an expression with only a single token, so remove it.

Reported-by: Peter Maydell 
Signed-off-by: Anton Johansson 
---
 target/hexagon/idef-parser/idef-parser.y | 1 -
 1 file changed, 1 deletion(-)

diff --git a/target/hexagon/idef-parser/idef-parser.y 
b/target/hexagon/idef-parser/idef-parser.y
index cd2612eb8c..a6587f5bcc 100644
--- a/target/hexagon/idef-parser/idef-parser.y
+++ b/target/hexagon/idef-parser/idef-parser.y
@@ -802,7 +802,6 @@ rvalue : FAIL
 
 lvalue : FAIL
  {
- @1.last_column = @1.last_column;
  yyassert(c, &@1, false, "Encountered a FAIL token as lvalue.\n");
  }
| REG
-- 
2.41.0

Re: [PATCH QEMU v8 2/9] qapi/migration: Introduce x-vcpu-dirty-limit-period parameter

2023-07-13 Thread Markus Armbruster

~hyman  writes:

> From: Hyman Huang(黄勇) 
>
> Introduce "x-vcpu-dirty-limit-period" migration experimental
> parameter, which is in the range of 1 to 1000ms and used to
> make dirtyrate calculation period configurable.

dirty rate

>
> Currently with the "x-vcpu-dirty-limit-period" varies, the

Currently, as the

> total time of live migration changes, test results show the

Suggest period instead of comma.

> optimal value of "x-vcpu-dirty-limit-period" ranges from
> 500ms to 1000 ms. "x-vcpu-dirty-limit-period" should be made
> stable once it proves best value can not be determined with
> developer's experiments.
>
> Signed-off-by: Hyman Huang(黄勇) 
> Reviewed-by: Markus Armbruster 
> Reviewed-by: Juan Quintela 

[...]

> --- a/qapi/migration.json
> +++ b/qapi/migration.json
> @@ -789,9 +789,14 @@
>  # Nodes are mapped to their block device name if there is one, and
>  # to their node name otherwise.  (Since 5.2)
>  #
> +# @x-vcpu-dirty-limit-period: Periodic time (in milliseconds) of dirty
> +# limit during live migration. Should be in the range 1 to 1000ms,
> +# defaults to 1000ms. (Since 8.1)

Two spaces after sentence-ending punctuation, please:

   # @x-vcpu-dirty-limit-period: Periodic time (in milliseconds) of dirty
   # limit during live migration.  Should be in the range 1 to
   # 1000ms, defaults to 1000ms.  (Since 8.1)

> +#
>  # Features:
>  #
> -# @unstable: Member @x-checkpoint-delay is experimental.
> +# @unstable: Members @x-checkpoint-delay and
> +# @x-vcpu-dirty-limit-period are experimental.
>  #
>  # Since: 2.4
>  ##
> @@ -809,8 +814,10 @@
> 'multifd-channels',
> 'xbzrle-cache-size', 'max-postcopy-bandwidth',
> 'max-cpu-throttle', 'multifd-compression',
> -   'multifd-zlib-level' ,'multifd-zstd-level',
> -   'block-bitmap-mapping' ] }
> +   'multifd-zlib-level', 'multifd-zstd-level',
> +   'block-bitmap-mapping',
> +   { 'name': 'x-vcpu-dirty-limit-period',
> + 'features': ['unstable'] } ] }
>  
>  ##
>  # @MigrateSetParameters:
> @@ -945,9 +952,14 @@
>  # Nodes are mapped to their block device name if there is one, and
>  # to their node name otherwise.  (Since 5.2)
>  #
> +# @x-vcpu-dirty-limit-period: Periodic time (in milliseconds) of dirty
> +# limit during live migration. Should be in the range 1 to 1000ms,
> +# defaults to 1000ms. (Since 8.1)

Likewise.

> +#
>  # Features:
>  #
> -# @unstable: Member @x-checkpoint-delay is experimental.
> +# @unstable: Members @x-checkpoint-delay and
> +# @x-vcpu-dirty-limit-period are experimental.
>  #
>  # TODO: either fuse back into MigrationParameters, or make
>  # MigrationParameters members mandatory
> @@ -982,7 +994,9 @@
>  '*multifd-compression': 'MultiFDCompression',
>  '*multifd-zlib-level': 'uint8',
>  '*multifd-zstd-level': 'uint8',
> -'*block-bitmap-mapping': [ 'BitmapMigrationNodeAlias' ] } }
> +'*block-bitmap-mapping': [ 'BitmapMigrationNodeAlias' ],
> +'*x-vcpu-dirty-limit-period': { 'type': 'uint64',
> +'features': [ 'unstable' ] } } }
>  
>  ##
>  # @migrate-set-parameters:
> @@ -1137,9 +1151,14 @@
>  # Nodes are mapped to their block device name if there is one, and
>  # to their node name otherwise.  (Since 5.2)
>  #
> +# @x-vcpu-dirty-limit-period: Periodic time (in milliseconds) of dirty
> +# limit during live migration. Should be in the range 1 to 1000ms,
> +# defaults to 1000ms. (Since 8.1)

Likewise.

> +#
>  # Features:
>  #
> -# @unstable: Member @x-checkpoint-delay is experimental.
> +# @unstable: Members @x-checkpoint-delay and
> +# @x-vcpu-dirty-limit-period are experimental.
>  #
>  # Since: 2.4
>  ##
> @@ -1171,7 +1190,9 @@
>  '*multifd-compression': 'MultiFDCompression',
>  '*multifd-zlib-level': 'uint8',
>  '*multifd-zstd-level': 'uint8',
> -'*block-bitmap-mapping': [ 'BitmapMigrationNodeAlias' ] } }
> +'*block-bitmap-mapping': [ 'BitmapMigrationNodeAlias' ],
> +'*x-vcpu-dirty-limit-period': { 'type': 'uint64',
> +'features': [ 'unstable' ] } } }
>  
>  ##
>  # @query-migrate-parameters:

The issues I'm pointing out don't justfy yet another respin.  But if you
need to respin the series for some other reason, plase take care of them.

Re: [PATCH 0/3] hw/arm/virt: Use generic CPU invalidation

2023-07-13 Thread Gavin Shan


Hi Peter and Marcin,

On 7/13/23 21:52, Marcin Juszkiewicz wrote:

W dniu 13.07.2023 o 13:44, Peter Maydell pisze:


I see this isn't a change in this patch, but given that
what the user specifies is not "cortex-a8-arm-cpu" but
"cortex-a8", why do we include the "-arm-cpu" suffix in
the error messages? It's not valid syntax to say
"-cpu cortex-a8-arm-cpu", so it's a bit misleading...


Internally those cpu names are "max-{TYPE_ARM_CPU}" and similar for other 
architectures.

I like the change but it (IMHO) needs to cut "-{TYPE_*_CPU}" string from names:

13:37 marcin@applejack:qemu$ ./build/aarch64-softmmu/qemu-system-aarch64 -M 
virt -cpu cortex-r5
qemu-system-aarch64: Invalid CPU type: cortex-r5-arm-cpu
The valid types are: cortex-a7-arm-cpu, cortex-a15-arm-cpu, cortex-a35-arm-cpu, 
cortex-a55-arm-cpu, cortex-a72-arm-cpu, cortex-a76-arm-cpu, a64fx-arm-cpu, 
neoverse-n1-arm-cpu, neoverse-v1-arm-cpu, cortex-a53-arm-cpu, 
cortex-a57-arm-cpu, host-arm-cpu, max-arm-cpu

13:37 marcin@applejack:qemu$ ./build/aarch64-softmmu/qemu-system-aarch64 -M 
virt -cpu cortex-a57-arm-cpu
qemu-system-aarch64: unable to find CPU model 'cortex-a57-arm-cpu'



The suffix of CPU types are provided in hw/arm/virt.c::valid_cpu_types in 
PATCH[2].
In the generic validation, the complete CPU type is used. The error message also
have complete CPU type there.

Peter and Marcin, how about to split the CPU types to two fields, as below? In 
this
way, the complete CPU type will be used for validation and the 'internal' names 
will
be used for the error messages.

struct MachineClass {
const char *valid_cpu_type_suffix;
const char **valid_cpu_types;
};

hw/arm/virt.c
-

static const char *valid_cpu_types[] = {
#ifdef CONFIG_TCG
"cortex-a7",
"cortex-a15",
"cortex-a35",
"cortex-a55",
"cortex-a72",
"cortex-a76",
"a64fx",
"neoverse-n1",
"neoverse-v1",
#endif
"cortex-a53",
"cortex-a57",
"host",
"max",
NULL
};

static void virt_machine_class_init(ObjectClass *oc, void *data)
{
mc->valid_cpu_type_suffix = TYPE_ARM_CPU;
mc->valid_cpu_types = valid_cpu_types;
}

hw/core/machine.c
-

static void validate_cpu_type(MachineState *machine)
{
ObjectClass *oc = object_class_by_name(machine->cpu_type), *ret = NULL;
CPUClass *cc = CPU_CLASS(oc);
char *cpu_type;
int i;

if (!machine->cpu_type || !machine_class->valid_cpu_types) {
goto out_no_check;
}

for (i = 0; machine_class->valid_cpu_types[i]; i++) {
/* The class name is the combination of prefix + suffix */
if (machine_class->valid_cpu_type_suffix) {
cpu_type = g_strdup_printf("%s-%s", machine_class->valid_cpu_types[i], 
machine_class->valid_cpu_type_suffix);
} else {
cpu_type = g_strdup(machine_class->valid_cpu_types[i]);
}

ret = object_class_dynamic_cast(oc, cpu_type));
g_free(cpu_type);
if (ret) {
break;
}
}

if (!machine_class->valid_cpu_types[i]) {
/* The user-specified CPU type is invalid */
/ the prefix is used in the error messages /
error_report("Invalid CPU type: %s", machine->cpu_type);
error_printf("The valid types are: %s",
 machine_class->valid_cpu_types[0]);
for (i = 1; machine_class->valid_cpu_types[i]; i++) {
error_printf(", %s", machine_class->valid_cpu_types[i]);
}
error_printf("\n");

exit(1);
}

/* Check if CPU type is deprecated and warn if so */
out_no_check:
if (cc && cc->deprecation_note) {
warn_report("CPU model %s is deprecated -- %s",
machine->cpu_type, cc->deprecation_note);
}
}



I can change sbsa-ref to follow that change but let it be userfriendly.



Yes, sbsa-ref needs an extra patch to use the generic invalidation. I can add
one patch on the top for that in next revision if you agree, Marcin.

Thanks,
Gavin

Re: [PATCH V9 32/46] vfio-pci: cpr part 2 (msi)

2023-07-13 Thread Kunkun Jiang via


Hi Steve,

On 2023/7/10 23:43, Steven Sistare wrote:

On 7/5/2023 4:56 AM, Kunkun Jiang wrote:

Hi Steve,

I have a few questions about the msi part of the vfio device.
In the reboot mode, you mentioned "The guest drivers' suspend methods
flush outstanding requests and re-initialize the devices". This means,
during the downtime,the vfio device dose not generate interrupts. So
no special processing is required for the msi interrupt of the vfio
device. Am I right?

Correct.


In the cpr-exec mode, will the vfio device be "stopped"? If the vfio device
is running all the time,it is possible to generate interrrupts during the
downtime. How to deal with these interrupts?

The vfio device is not stopped, but its connection to the kvm instance is 
severed.
Interrupts are pended in the vfio kernel state, and that state is preserved 
across
exec, by preserving the vfio descriptors.  After exec, qemu creates a new kvm 
instance,
attaches vfio to it, and the interrupts are delivered.


In addition, ARM GICv4 provides support for the direct injection of vLPIs.
Interrupts are more difficult to handle. In this case, what should be done?

I have not studied this feature or tried it.
Have you analyzed the VT-D post-interrupt feature? It is similar to the 
direct
injection of vLPI. They all implement the interrupt injecting channel by 
executing

irq_bypass_register_consumer.

According to the current kernel code, it need to first cancel the 
connection between
vfio producer and kvm consumer, then establishes the connection between 
vfio producer
and the new kvm consumer. During the unbinding process, both ARM and x86 
will execute
the callback of *_set_vcpu_affinity to modify the interrupt reporting 
channel. For x86,
in the process of stop posting interrupts, back to remapping mode, 
cmpxchg_double is

used in the code. Does this guarantee that interrupts will not be lost?

For ARM, it will first send a DISCARD command to ITS and then establish 
the interrupt
reporting channel for GICv3. The DISCARD will remove the pending 
interrupt. Interrupts

that come before channel re-establishment are silently discarded.
Do you guys have any good ideas?

Look forward to your reply.
Kunkun Jiang


- Steve


Look forward to your reply.

Kunkun Jiang

On 2022/7/27 0:10, Steve Sistare wrote:

Finish cpr for vfio-pci MSI/MSI-X devices by preserving eventfd's and
vector state.

Signed-off-by: Steve Sistare 
---
   hw/vfio/pci.c | 119 
+-
   1 file changed, 118 insertions(+), 1 deletion(-)

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index b5fd2ec..1d0e8db 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -54,17 +54,47 @@ static void vfio_disable_interrupts(VFIOPCIDevice *vdev);
   static void vfio_mmap_set_enabled(VFIOPCIDevice *vdev, bool enabled);
   static void vfio_msi_disable_common(VFIOPCIDevice *vdev);
   +#define EVENT_FD_NAME(vdev, name)   \
+    g_strdup_printf("%s_%s", (vdev)->vbasedev.name, (name))
+
+static void save_event_fd(VFIOPCIDevice *vdev, const char *name, int nr,
+  EventNotifier *ev)
+{
+    int fd = event_notifier_get_fd(ev);
+
+    if (fd >= 0) {
+    g_autofree char *fdname = EVENT_FD_NAME(vdev, name);
+
+    cpr_resave_fd(fdname, nr, fd);
+    }
+}
+
+static int load_event_fd(VFIOPCIDevice *vdev, const char *name, int nr)
+{
+    g_autofree char *fdname = EVENT_FD_NAME(vdev, name);
+    return cpr_find_fd(fdname, nr);
+}
+
+static void delete_event_fd(VFIOPCIDevice *vdev, const char *name, int nr)
+{
+    g_autofree char *fdname = EVENT_FD_NAME(vdev, name);
+    cpr_delete_fd(fdname, nr);
+}
+
   /* Create new or reuse existing eventfd */
   static int vfio_notifier_init(VFIOPCIDevice *vdev, EventNotifier *e,
     const char *name, int nr)
   {
-    int fd = -1;   /* placeholder until a subsequent patch */
   int ret = 0;
+    int fd = load_event_fd(vdev, name, nr);
     if (fd >= 0) {
   event_notifier_init_fd(e, fd);
   } else {
   ret = event_notifier_init(e, 0);
+    if (!ret) {
+    save_event_fd(vdev, name, nr, e);
+    }
   }
   return ret;
   }
@@ -72,6 +102,7 @@ static int vfio_notifier_init(VFIOPCIDevice *vdev, 
EventNotifier *e,
   static void vfio_notifier_cleanup(VFIOPCIDevice *vdev, EventNotifier *e,
     const char *name, int nr)
   {
+    delete_event_fd(vdev, name, nr);
   event_notifier_cleanup(e);
   }
   @@ -512,6 +543,15 @@ static int vfio_msix_vector_do_use(PCIDevice *pdev, 
unsigned int nr,
   VFIOMSIVector *vector;
   int ret;
   +    /*
+ * Ignore the callback from msix_set_vector_notifiers during resume.
+ * The necessary subset of these actions is called from vfio_claim_vectors
+ * during post load.
+ */
+    if (vdev->vbasedev.reused) {
+    return 0;
+    }
+
   trace_vfio_msix_vector_do_use(vdev->vbasedev.name, nr);
     vector = &vdev->msi_vecto

Re: [PATCH] tcg: Fix info_in_idx increment in layout_arg_by_ref

2023-07-13 Thread Peter Maydell

On Fri, 7 Jul 2023 at 11:29, Richard Henderson
 wrote:
>
> Off by one error, failing to take into account that layout_arg_1
> already incremeneted info_in_idx for the first piece.  We only
> need care for the n-1 TCG_CALL_ARG_BY_REF_N pieces here.
>
> Cc: qemu-sta...@nongnu.org
> Fixes: 313bdea84d2 ("tcg: Add TCG_CALL_{RET,ARG}_BY_REF")
> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1751
> Signed-off-by: Richard Henderson 
> ---

Tested-by: Peter Maydell 

-- PMM

Re: [PATCH V9 32/46] vfio-pci: cpr part 2 (msi)

2023-07-13 Thread Marc Zyngier

On Thu, 13 Jul 2023 13:35:57 +0100,
Kunkun Jiang  wrote:
> 
> For ARM, it will first send a DISCARD command to ITS and then
> establish the interrupt reporting channel for GICv3. The DISCARD
> will remove the pending interrupt. Interrupts that come before
> channel re-establishment are silently discarded.  Do you guys have
> any good ideas?

I'm missing the context, but if you're worried about interrupts that
are lost between the DISCARD and the MAPTI commands, the only way to
solve the problem is to inject a spurious interrupt after the MAPTI
has taken place.

If it hurts, don't do that.

M.

-- 
Without deviation from the norm, progress is not possible.

Re: [PATCH 0/3] hw/arm/virt: Use generic CPU invalidation

2023-07-13 Thread Gavin Shan


Hi Peter,

On 7/13/23 21:44, Peter Maydell wrote:

On Thu, 13 Jul 2023 at 06:45, Gavin Shan  wrote:


There is a generic CPU type invalidation in machine_run_board_init()
and we needn't a same and private invalidation for hw/arm/virt machines.
This series intends to use the generic CPU type invalidation on the
hw/arm/virt machines.

PATCH[1] factors the CPU type invalidation logic in machine_run_board_init()
  to a helper validate_cpu_type().
PATCH[2] uses the generic CPU type invalidation for hw/arm/virt machines
PATCH[3] support "host-arm-cpu" CPU type only when KVM or HVF is visible

Testing
===

With the following command lines, the output messages are varied before
and after the series is applied.

   /home/gshan/sandbox/src/qemu/main/build/qemu-system-aarch64 \
   -accel tcg -machine virt,gic-version=3,nvdimm=on\
   -cpu cortex-a8 -smp maxcpus=2,cpus=1\
 :

Before the series is applied:

   qemu-system-aarch64: mach-virt: CPU type cortex-a8-arm-cpu not supported

After the series is applied:

   qemu-system-aarch64: Invalid CPU type: cortex-a8-arm-cpu
   The valid types are: cortex-a7-arm-cpu, cortex-a15-arm-cpu, \
   cortex-a35-arm-cpu, cortex-a55-arm-cpu, cortex-a72-arm-cpu, \
   cortex-a76-arm-cpu, a64fx-arm-cpu, neoverse-n1-arm-cpu, \
   neoverse-v1-arm-cpu, cortex-a53-arm-cpu, cortex-a57-arm-cpu, \
   max-arm-cpu


I see this isn't a change in this patch, but given that
what the user specifies is not "cortex-a8-arm-cpu" but
"cortex-a8", why do we include the "-arm-cpu" suffix in
the error messages? It's not valid syntax to say
"-cpu cortex-a8-arm-cpu", so it's a bit misleading...



Good point. Right, the complete CPU type names are provided by board 
(hw/arm/virt).
The compelte CPU type names are used in hw/core/machine.c to search the object
class. In the error messages in the same source file, the complete CPU type 
names
are also used. Actually, we need 'internal' names like 'cortex-a8' to be shown 
in the
error messages.

For the solution, I've suggested to split the MachineClass::valid_cpu_types to
two fields (valid_cpu_types and valid_cpu_type_suffix). Their combination is
the complete CPU type name and 'valid_cpu_types[i]' corresponds to the 
'internal'
name, to be used in the error messages. Please take a look on that thread and 
reply
to it.

Thanks,
Gavin

Re: [PATCH 0/3] hw/arm/virt: Use generic CPU invalidation

2023-07-13 Thread Marcin Juszkiewicz


W dniu 13.07.2023 o 14:34, Gavin Shan pisze:

On 7/13/23 21:52, Marcin Juszkiewicz wrote:

W dniu 13.07.2023 o 13:44, Peter Maydell pisze:

I see this isn't a change in this patch, but given that what the 
user specifies is not "cortex-a8-arm-cpu" but "cortex-a8", why

do we include the "-arm-cpu" suffix in the error messages? It's
not valid syntax to say "-cpu cortex-a8-arm-cpu", so it's a bit 
misleading...



I like the change but it (IMHO) needs to cut "-{TYPE_*_CPU}"
string from names:


Peter and Marcin, how about to split the CPU types to two fields, as 
below? In this way, the complete CPU type will be used for validation

and the 'internal' names will be used for the error messages.


Note that it should probably propagate to all architectures handled in 
QEMU, not only Arm ones. I do not know which machines have list of 
supported cpu types like arm/virt or arm/sbsa-ref ones.
I can change sbsa-ref to follow that change but let it be 
userfriendly.



Yes, sbsa-ref needs an extra patch to use the generic invalidation.
I can add one patch on the top for that in next revision if you
agree, Marcin.


I am fine with it.

Re: [PATCH QEMU v8 4/9] migration: Introduce dirty-limit capability

2023-07-13 Thread Markus Armbruster

~hyman  writes:

> From: Hyman Huang(黄勇) 
>
> Introduce migration dirty-limit capability, which can
> be turned on before live migration and limit dirty
> page rate durty live migration.
>
> Introduce migrate_dirty_limit function to help check
> if dirty-limit capability enabled during live migration.
>
> Meanwhile, refactor vcpu_dirty_rate_stat_collect
> so that period can be configured instead of hardcoded.
>
> dirty-limit capability is kind of like auto-converge
> but using dirty limit instead of traditional cpu-throttle
> to throttle guest down. To enable this feature, turn on
> the dirty-limit capability before live migration using
> migrate-set-capabilities, and set the parameters
> "x-vcpu-dirty-limit-period", "vcpu-dirty-limit" suitably
> to speed up convergence.
>
> Signed-off-by: Hyman Huang(黄勇) 
> Acked-by: Peter Xu 
> Reviewed-by: Juan Quintela 

[...]

> diff --git a/qapi/migration.json b/qapi/migration.json
> index e43371955a..031832cde5 100644
> --- a/qapi/migration.json
> +++ b/qapi/migration.json
> @@ -497,6 +497,15 @@
>  # are present.  'return-path' capability must be enabled to use
>  # it.  (since 8.1)
>  #
> +# @dirty-limit: If enabled, migration will use the dirty-limit
> +# algorithm to throttle down guest instead of auto-converge
> +# algorithm. This algorithm only works when vCPU's dirtyrate

Two spaces after sentence-ending punctuation, please.

"dirty rate" with a space, because that's how we spell it elsewhere.

> +# greater than 'vcpu-dirty-limit', read processes in guest os
> +# aren't penalized any more, so the algorithm can improve
> +# performance of vCPU during live migration. This is an optional
> +# performance feature and should not affect the correctness of the
> +# existing auto-converge algorithm. (since 8.1)
> +#

I'm still confused.

The text suggests there are two separate algorithms "to throttle down
guest": "auto converge" and "dirty limit", and we get to pick one.
Correct?

If it is correct, then the last sentence feels redundant: picking
another algorithm can't affect the algorithm we're *not* using.  What
are you trying to express here?

When do we use "auto converge", and when do we use "dirty limit"?

What does the user really need to know about these algorithms?  Enough
to pick one, I guess.  That means advantages and disadvantages of the
two algorithms.  Which are?

>  # Features:
>  #
>  # @unstable: Members @x-colo and @x-ignore-shared are experimental.
> @@ -512,7 +521,8 @@
> 'dirty-bitmaps', 'postcopy-blocktime', 'late-block-activate',
> { 'name': 'x-ignore-shared', 'features': [ 'unstable' ] },
> 'validate-uuid', 'background-snapshot',
> -   'zero-copy-send', 'postcopy-preempt', 'switchover-ack'] }
> +   'zero-copy-send', 'postcopy-preempt', 'switchover-ack',
> +   'dirty-limit'] }
>  
>  ##
>  # @MigrationCapabilityStatus:

[...]

Re: [PATCH QEMU v8 8/9] migration: Extend query-migrate to provide dirty page limit info

2023-07-13 Thread Markus Armbruster

~hyman  writes:

> From: Hyman Huang(黄勇) 
>
> Extend query-migrate to provide throttle time and estimated
> ring full time with dirty-limit capability enabled, through which
> we can observe if dirty limit take effect during live migration.
>
> Signed-off-by: Hyman Huang(黄勇) 
> Reviewed-by: Markus Armbruster 
> Reviewed-by: Juan Quintela 

[...]

> diff --git a/qapi/migration.json b/qapi/migration.json
> index 031832cde5..97f7d0fd3c 100644
> --- a/qapi/migration.json
> +++ b/qapi/migration.json
> @@ -250,6 +250,18 @@
>  # blocked.  Present and non-empty when migration is blocked.
>  # (since 6.0)
>  #
> +# @dirty-limit-throttle-time-per-round: Maximum throttle time
> +# (in microseconds) of virtual CPUs each dirty ring full round,
> +# which shows how MigrationCapability dirty-limit affects the
> +# guest during live migration. (since 8.1)

Two spaces after sentence-ending punctuation, please:

   # guest during live migration.  (since 8.1)

Same below.

> +#
> +# @dirty-limit-ring-full-time: Estimated average dirty ring full
> +# time (in microseconds) each dirty ring full round. The value
> +# equals dirty ring memory size divided by average dirty page
> +# rate of the virtual CPU, which can be used to observe the
> +# average memory load of the virtual CPU indirectly. Note that
> +# zero means guest doesn't dirty memory (since 8.1)

Last sentence should end with with a period.

Together:

   # @dirty-limit-throttle-time-per-round: Maximum throttle time (in
   # microseconds) of virtual CPUs each dirty ring full round, which
   # shows how MigrationCapability dirty-limit affects the guest
   # during live migration.  (Since 8.1)
   #
   # @dirty-limit-ring-full-time: Estimated average dirty ring full time
   # (in microseconds) each dirty ring full round.  The value equals
   # dirty ring memory size divided by average dirty page rate of the
   # virtual CPU, which can be used to observe the average memory
   # load of the virtual CPU indirectly.  Note that zero means guest
   # doesn't dirty memory.  (Since 8.1)

> +#
>  # Since: 0.14
>  ##
>  { 'struct': 'MigrationInfo',
> @@ -267,7 +279,9 @@
> '*postcopy-blocktime' : 'uint32',
> '*postcopy-vcpu-blocktime': ['uint32'],
> '*compression': 'CompressionStats',
> -   '*socket-address': ['SocketAddress'] } }
> +   '*socket-address': ['SocketAddress'],
> +   '*dirty-limit-throttle-time-per-round': 'uint64',
> +   '*dirty-limit-ring-full-time': 'uint64'} }
>  
>  ##
>  # @query-migrate:

[...]

The issues I'm pointing out don't justfy yet another respin.  But if you
need to respin the series for some other reason, plase take care of them.

Re: [PATCH 3/3] hw/arm/virt: Support host CPU type only when KVM or HVF is configured

2023-07-13 Thread Cornelia Huck

On Thu, Jul 13 2023, Gavin Shan  wrote:

> The CPU type 'host-arm-cpu' class won't be registered until KVM or
> HVF is configured in target/arm/cpu64.c. Support the corresponding
> CPU type only when KVM or HVF is configured.
>
> Signed-off-by: Gavin Shan 
> ---
>  hw/arm/virt.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index 43d7772ffd..ad28634445 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -217,7 +217,9 @@ static const char *valid_cpu_types[] = {
>  #endif
>  ARM_CPU_TYPE_NAME("cortex-a53"),
>  ARM_CPU_TYPE_NAME("cortex-a57"),
> +#if defined(CONFIG_KVM) || defined(CONFIG_HVF)
>  ARM_CPU_TYPE_NAME("host"),
> +#endif
>  ARM_CPU_TYPE_NAME("max"),
>  NULL
>  };

Doesn't the check in parse_cpu_option() already catch the case where
the "host" cpu model isn't registered? I might be getting lost in the
code flow, though.

Re: [PATCH 0/3] hw/arm/virt: Use generic CPU invalidation

2023-07-13 Thread Gavin Shan


Hi Marcin,

On 7/13/23 22:44, Marcin Juszkiewicz wrote:

W dniu 13.07.2023 o 14:34, Gavin Shan pisze:

On 7/13/23 21:52, Marcin Juszkiewicz wrote:

W dniu 13.07.2023 o 13:44, Peter Maydell pisze:


I see this isn't a change in this patch, but given that what the user specifies is not 
"cortex-a8-arm-cpu" but "cortex-a8", why
do we include the "-arm-cpu" suffix in the error messages? It's
not valid syntax to say "-cpu cortex-a8-arm-cpu", so it's a bit misleading...



I like the change but it (IMHO) needs to cut "-{TYPE_*_CPU}"
string from names:



Peter and Marcin, how about to split the CPU types to two fields, as below? In 
this way, the complete CPU type will be used for validation
and the 'internal' names will be used for the error messages.


Note that it should probably propagate to all architectures handled in QEMU, 
not only Arm ones. I do not know which machines have list of supported cpu 
types like arm/virt or arm/sbsa-ref ones.


Right, there are more boards eligible for this generic CPU type invalidation. I 
will
cover all of them in next revision. Currently, the pattern of prefix + suffix, 
used
to split the CPU type name, can work well for all cases.

[gshan@gshan hw]$ git grep strcmp.*cpu_type
arm/bananapi_m2u.c:if (strcmp(machine->cpu_type, 
ARM_CPU_TYPE_NAME("cortex-a7")) != 0) {
arm/cubieboard.c:if (strcmp(machine->cpu_type, 
ARM_CPU_TYPE_NAME("cortex-a8")) != 0) {
arm/mps2-tz.c:if (strcmp(machine->cpu_type, mc->default_cpu_type) != 0) {
arm/mps2.c:if (strcmp(machine->cpu_type, mc->default_cpu_type) != 0) {
arm/msf2-som.c:if (strcmp(machine->cpu_type, mc->default_cpu_type) != 0) {
arm/musca.c:if (strcmp(machine->cpu_type, mc->default_cpu_type) != 0) {
arm/npcm7xx_boards.c:if (strcmp(machine->cpu_type, mc->default_cpu_type) != 
0) {
arm/orangepi.c:if (strcmp(machine->cpu_type, 
ARM_CPU_TYPE_NAME("cortex-a7")) != 0) {
avr/boot.c:if (strcmp(elf_cpu, mcu_cpu_type)) {   
< needsn't CPU type validation
riscv/shakti_c.c:if (strcmp(mstate->cpu_type, TYPE_RISCV_CPU_SHAKTI_C) != 
0) {



I can change sbsa-ref to follow that change but let it be userfriendly.



Yes, sbsa-ref needs an extra patch to use the generic invalidation.
I can add one patch on the top for that in next revision if you
agree, Marcin.


I am fine with it.



Thanks a lot for your confirm, Marcin.

Thanks,
Gavin

Re: [PATCH 00/11] tpm: introduce TPM CRB SysBus device

2023-07-13 Thread Stefan Berger





On 7/12/23 23:51, Joelle van Dyne wrote:

The impetus for this patch set is to get TPM 2.0 working on Windows 11 ARM64.
Windows' tpm.sys does not seem to work on a TPM TIS device (as verified with
VMWare's implementation). However, the current TPM CRB device uses a fixed
system bus address that is reserved for RAM in ARM64 Virt machines.


Thanks a lot for this work. The last sentence seems to hint at the current issue
with TPM CRB on ARM64 and seems to be the only way forward there. You may want
to reformulate it a bit because it's not clear how the 'however' related to
CRB relates to TIS.



In the process of adding the TPM CRB SysBus device, we also went ahead and
cleaned up some of the existing TPM hardware code and fixed some bugs. We used


Please reorder bugs to the beginning of the series or submit in an extra patch 
set
so we can backport them. Ideal would be description(s) for how to trigger the 
bug(s).


the TPM TIS devices as a template for the TPM CRB devices and refactored out
common code. We moved the ACPI DSDT generation to the device in order to handle
dynamic base address requirements as well as reduce redundent code in different

s/redundent/redundant



machine ACPI generation. We also changed the tpm_crb device to use the ISA bus
instead of depending on the default system bus as the device only was built for
the PC configuration.

Another change is that the TPM CRB registers are now mapped in the same way that
the pflash ROM devices are mapped. It is a memory region whose writes are
trapped as MMIO accesses. This was needed because Apple Silicon does not decode
LDP caused page faults. @agraf suggested that we do this to avoid having to


Afaik, LDP is an ARM assembly instruction that loads two 32bit or 64bit 
registers from
consecutive addresses. May be worth mentioning for those wondering about it...


do AARCH64 decoding in the HVF fault handler.


What is HVF?

Regards,
   Stefan


Unfortunately, it seems like the LDP fault still happens on HVF but the issue
seems to be in the HVF backend which needs to be fixed in a separate patch.

One last thing that's needed to get Windows 11 to recognize the TPM 2.0 device
is for the OVMF firmware to setup the TPM device. Currently, OVMF for ARM64 Virt
only recognizes the TPM TIS device through a FDT entry. A workaround is to
falsely identify the TPM CRB device as a TPM TIS device in the FDT node but this
causes issues for Linux. A proper fix would involve adding an ACPI device driver
in OVMF.

Joelle van Dyne (11):
   tpm_crb: refactor common code
   tpm_crb: CTRL_RSP_ADDR is 64-bits wide
   tpm_ppi: refactor memory space initialization
   tpm_crb: use a single read-as-mem/write-as-mmio mapping
   tpm_crb: use the ISA bus
   tpm_crb: move ACPI table building to device interface
   hw/arm/virt: add plug handler for TPM on SysBus
   hw/loongarch/virt: add plug handler for TPM on SysBus
   tpm_tis_sysbus: fix crash when PPI is enabled
   tpm_tis_sysbus: move DSDT AML generation to device
   tpm_crb_sysbus: introduce TPM CRB SysBus device

  docs/specs/tpm.rst  |   2 +
  hw/tpm/tpm_crb.h|  74 +
  hw/tpm/tpm_ppi.h|  10 +-
  include/hw/acpi/aml-build.h |   1 +
  include/hw/acpi/tpm.h   |   3 +-
  include/sysemu/tpm.h|   3 +
  hw/acpi/aml-build.c |   7 +-
  hw/arm/virt-acpi-build.c|  38 +
  hw/arm/virt.c   |  38 +
  hw/core/sysbus-fdt.c|   1 +
  hw/i386/acpi-build.c|  23 ---
  hw/loongarch/acpi-build.c   |  38 +
  hw/loongarch/virt.c |  38 +
  hw/riscv/virt.c |   1 +
  hw/tpm/tpm_crb.c| 307 
  hw/tpm/tpm_crb_common.c | 224 ++
  hw/tpm/tpm_crb_sysbus.c | 178 +
  hw/tpm/tpm_ppi.c|   5 +-
  hw/tpm/tpm_tis_isa.c|   5 +-
  hw/tpm/tpm_tis_sysbus.c |  43 +
  tests/qtest/tpm-crb-test.c  |   2 +-
  tests/qtest/tpm-util.c  |   2 +-
  hw/arm/Kconfig  |   1 +
  hw/riscv/Kconfig|   1 +
  hw/tpm/Kconfig  |   7 +-
  hw/tpm/meson.build  |   3 +
  hw/tpm/trace-events |   2 +-
  27 files changed, 703 insertions(+), 354 deletions(-)
  create mode 100644 hw/tpm/tpm_crb.h
  create mode 100644 hw/tpm/tpm_crb_common.c
  create mode 100644 hw/tpm/tpm_crb_sysbus.c

Re: [PATCH] linux-user: Remove pointless NULL check in clock_adjtime handling

2023-07-13 Thread Peter Maydell

I'll take this via target-arm.next unless there are any
objections...

thanks
-- PMM

On Tue, 4 Jul 2023 at 14:26, Peter Maydell  wrote:
>
> Laurent, ping? This patch has been reviewed.
>
> thanks
> -- PMM
>
> On Fri, 23 Jun 2023 at 15:44, Peter Maydell  wrote:
> >
> > In the code for TARGET_NR_clock_adjtime, we set the pointer phtx to
> > the address of the local variable htx.  This means it can never be
> > NULL, but later in the code we check it for NULL anyway.  Coverity
> > complains about this (CID 1507683) because the NULL check comes after
> > a call to clock_adjtime() that assumes it is non-NULL.
> >
> > Since phtx is always &htx, and is used only in three places, it's not
> > really necessary.  Remove it, bringing the code structure in to line
> > with that for TARGET_NR_clock_adjtime64, which already uses a simple
> > '&htx' when it wants a pointer to 'htx'.
> >
> > Signed-off-by: Peter Maydell 
> > ---
> >  linux-user/syscall.c | 12 +---
> >  1 file changed, 5 insertions(+), 7 deletions(-)
> >
> > diff --git a/linux-user/syscall.c b/linux-user/syscall.c
> > index f2cb101d83c..7b2f9f7340e 100644
> > --- a/linux-user/syscall.c
> > +++ b/linux-user/syscall.c
> > @@ -10935,16 +10935,14 @@ static abi_long do_syscall1(CPUArchState 
> > *cpu_env, int num, abi_long arg1,
> >  #if defined(TARGET_NR_clock_adjtime) && defined(CONFIG_CLOCK_ADJTIME)
> >  case TARGET_NR_clock_adjtime:
> >  {
> > -struct timex htx, *phtx = &htx;
> > +struct timex htx;
> >
> > -if (target_to_host_timex(phtx, arg2) != 0) {
> > +if (target_to_host_timex(&htx, arg2) != 0) {
> >  return -TARGET_EFAULT;
> >  }
> > -ret = get_errno(clock_adjtime(arg1, phtx));
> > -if (!is_error(ret) && phtx) {
> > -if (host_to_target_timex(arg2, phtx) != 0) {
> > -return -TARGET_EFAULT;
> > -}
> > +ret = get_errno(clock_adjtime(arg1, &htx));
> > +if (!is_error(ret) && host_to_target_timex(arg2, &htx)) {
> > +return -TARGET_EFAULT;
> >  }
> >  }
> >  return ret;
> > --

Re: [PATCH 07/11] hw/arm/virt: add plug handler for TPM on SysBus

2023-07-13 Thread Stefan Berger





On 7/12/23 23:51, Joelle van Dyne wrote:

TPM needs to know its own base address in order to generate its DSDT
device entry.


This and the loongarch patch seem to have largely identical virt_tpm_plug 
functions.
Could they be consolidated in hw/tpm/virt.c  ?

   Stefan


Signed-off-by: Joelle van Dyne 
---
  hw/arm/virt.c | 37 +
  1 file changed, 37 insertions(+)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 7d9dbc2663..432148ef47 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -2732,6 +2732,37 @@ static void virt_memory_plug(HotplugHandler *hotplug_dev,
   dev, &error_abort);
  }

+#ifdef CONFIG_TPM
+static void virt_tpm_plug(VirtMachineState *vms, TPMIf *tpmif)
+{
+PlatformBusDevice *pbus = PLATFORM_BUS_DEVICE(vms->platform_bus_dev);
+hwaddr pbus_base = vms->memmap[VIRT_PLATFORM_BUS].base;
+SysBusDevice *sbdev = SYS_BUS_DEVICE(tpmif);
+MemoryRegion *sbdev_mr;
+hwaddr tpm_base;
+uint64_t tpm_size;
+
+if (!sbdev || !object_dynamic_cast(OBJECT(sbdev), TYPE_SYS_BUS_DEVICE)) {
+return;
+}
+
+tpm_base = platform_bus_get_mmio_addr(pbus, sbdev, 0);
+assert(tpm_base != -1);
+
+tpm_base += pbus_base;
+
+sbdev_mr = sysbus_mmio_get_region(sbdev, 0);
+tpm_size = memory_region_size(sbdev_mr);
+
+if (object_property_find(OBJECT(sbdev), "baseaddr")) {
+object_property_set_uint(OBJECT(sbdev), "baseaddr", tpm_base, NULL);
+}
+if (object_property_find(OBJECT(sbdev), "size")) {
+object_property_set_uint(OBJECT(sbdev), "size", tpm_size, NULL);
+}
+}
+#endif
+
  static void virt_machine_device_pre_plug_cb(HotplugHandler *hotplug_dev,
  DeviceState *dev, Error **errp)
  {
@@ -2803,6 +2834,12 @@ static void virt_machine_device_plug_cb(HotplugHandler 
*hotplug_dev,
  vms->virtio_iommu_bdf = pci_get_bdf(pdev);
  create_virtio_iommu_dt_bindings(vms);
  }
+
+#ifdef CONFIG_TPM
+if (object_dynamic_cast(OBJECT(dev), TYPE_TPM_IF)) {
+virt_tpm_plug(vms, TPM_IF(dev));
+}
+#endif
  }

  static void virt_dimm_unplug_request(HotplugHandler *hotplug_dev,

Re: [PATCH 3/3] hw/arm/virt: Support host CPU type only when KVM or HVF is configured

2023-07-13 Thread Gavin Shan


Hi Connie,

On 7/13/23 22:46, Cornelia Huck wrote:

On Thu, Jul 13 2023, Gavin Shan  wrote:


The CPU type 'host-arm-cpu' class won't be registered until KVM or
HVF is configured in target/arm/cpu64.c. Support the corresponding
CPU type only when KVM or HVF is configured.

Signed-off-by: Gavin Shan 
---
  hw/arm/virt.c | 2 ++
  1 file changed, 2 insertions(+)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 43d7772ffd..ad28634445 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -217,7 +217,9 @@ static const char *valid_cpu_types[] = {
  #endif
  ARM_CPU_TYPE_NAME("cortex-a53"),
  ARM_CPU_TYPE_NAME("cortex-a57"),
+#if defined(CONFIG_KVM) || defined(CONFIG_HVF)
  ARM_CPU_TYPE_NAME("host"),
+#endif
  ARM_CPU_TYPE_NAME("max"),
  NULL
  };


Doesn't the check in parse_cpu_option() already catch the case where
the "host" cpu model isn't registered? I might be getting lost in the
code flow, though.



Right, it's guranteed that the needed CPU type (class) is registered by 
parse_cpu_option().
However, we have different story here. The CPU type invalidation intends to 
limit the CPU type
(class) into a range for the specific machine (board). Taking 
"cortex-a8-arm-cpu" as an example,
it's not expected by hw/arm/virt machines even it has been registered when we 
have CONFIG_TCG=y.
the list of supported CPU type (class) will be dumped by 
hw/core/machine.c::validate_cpu_type()
in PATCH[1], "host" is obviously invalid when we have CONFIG_KVM=n and 
CONFIG_HVF=n. We can't
tell user that "host" is supported, to confuse user.

Thanks,
Gavin

Re: [PATCH 2/2] migration: Make it clear that qemu_file_set_error() needs a negative value

2023-07-13 Thread Fabiano Rosas

Peter Maydell  writes:

> On Thu, 6 Jul 2023 at 20:52, Fabiano Rosas  wrote:
>>
>> The convention in qemu-file.c is to return a negative value on
>> error.
>>
>> The only place that could use qemu_file_set_error() to store a
>> positive value to f->last_error was vmstate_save() which has been
>> fixed in the previous patch.
>>
>> bdrv_inactivate_all() already returns a negative value on error.
>>
>> Document that qemu_file_set_error() needs -errno and alter the callers
>> to check ret < 0.
>>
>> Signed-off-by: Fabiano Rosas 
>> ---
>>  migration/qemu-file.c | 2 ++
>>  migration/savevm.c| 6 +++---
>>  2 files changed, 5 insertions(+), 3 deletions(-)
>>
>> diff --git a/migration/qemu-file.c b/migration/qemu-file.c
>> index acc282654a..8276bac248 100644
>> --- a/migration/qemu-file.c
>> +++ b/migration/qemu-file.c
>> @@ -222,6 +222,8 @@ int qemu_file_get_error(QEMUFile *f)
>>
>>  /*
>>   * Set the last error for stream f
>> + *
>> + * The error ('ret') should be in -errno format.
>>   */
>>  void qemu_file_set_error(QEMUFile *f, int ret)
>>  {
>> diff --git a/migration/savevm.c b/migration/savevm.c
>> index 95c2abf47c..f3c303ab74 100644
>> --- a/migration/savevm.c
>> +++ b/migration/savevm.c
>> @@ -1249,7 +1249,7 @@ void qemu_savevm_state_setup(QEMUFile *f)
>>  QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
>>  if (se->vmsd && se->vmsd->early_setup) {
>>  ret = vmstate_save(f, se, ms->vmdesc);
>> -if (ret) {
>> +if (ret < 0) {
>>  qemu_file_set_error(f, ret);
>
> You say qemu_file_set_error() should take an errno,
> but vmstate_save() doesn't return one. It will directly
> return whatever the VMStateInfo put, pre_save, etc hooks
> return, which isn't necessarily an errno. (Specifically,
> patch 1 in this series makes a .put hook return -1,
> rather than an errno. I'm guessing other implementations
> might too, though it's a bit hard to find them. A
> coccinelle script could probably locate them.)
>

All implementations return either 0, -1 or some errno; that one instance
from patch 1 returns 1. But you're right, those -1 are not really errno,
they are just "some negative value".

Since qemu-file.c puts the error through the error.c functions and those
call strerror(), all values that will go into qemu_file_set_error()
should be proper errnos.

I should probably audit users of qemu_file_set_error() instead and stop
using it for errors that have nothing to do with the actual migration
stream/QEMUFile. Currently it seems to have morphed into a mechanism to
record generic migration errors.

QEMU Summit Minutes 2023

2023-07-13 Thread Peter Maydell

QEMU Summit Minutes 2023


As usual, we held a QEMU Summit meeting at KVM Forum.  This is an
invite-only meeting for the most active maintainers and submaintainers
in the project, and we discuss various project-wide issues, usually
process stuff. We then post the minutes of the meeting to the list as
a jumping off point for wider discussion and for those who weren't
able to attend.

Attendees
=

"Peter Maydell" 
"Alex Bennée" 
"Kevin Wolf" 
"Thomas Huth" 
"Markus Armbruster" 
"Mark Cave-Ayland" 
"Philippe Mathieu-Daudé" 
"Daniel P. Berrangé" 
"Richard Henderson" 
"Michael S. Tsirkin" 
"Stefan Hajnoczi" 
"Alex Graf" 
"Gerd Hoffmann" 
"Paolo Bonzini" 
"Michael Roth" 

Topic 1: Dealing with tree wide changes
===

Mark Cave-Ayland raised concerns that tree wide changes often get
stuck because maintainers are conservative about merging code that
touches other subsystems and doesn't have review.  He mentioned a
couple of cases of PC refactoring which had been held up and
languished on the list due to lack of review time. It can be hard to
get everything in the change reviewed, and then hard to get the change
merged, especially if it still has parts that weren't reviewed by
anybody.

Alex Bennée mentioned that maintainers can always give an Acked-by and
then rely on someone else doing the review. But even getting
Acked-by's can take time and we still have a problem with absent
maintainers.

In a brief diversion Markus mused that having more automated checking
for things like QAPI changes would help reduce the maintainer load for
the more mechanical changes.

It was pointed out we should be more accepting of merging changes
without explicit maintainer approval where the changes are surface
level system wide API changes rather than touching the guts of any
particular subsystem. This avoids the sometimes onerous task of
splitting mechanical tree-wide changes along subsystem boundaries.
A delay of one-week + send a ping + one-week was suggested as
sufficient time for maintainers to reply if they care specifically
about the series.

Alex Graf suggested that we should hold different standards of review
requirements depending on the importance of the sub-system. We should
not hold up code because a minor underused subsystem didn't get
signoff. We already informally do this but we don't make it very clear
so it can be hard to tell what is and isn't OK to let through without
review.

Topic 2: Are we happy with the email workflow?
==

This was a topic to see if there was any consensus among maintainers
about the long-term acceptability of sticking with email for patch
submission and review -- in five years' time, if we're still doing it
the same way, how would we feel about it?

One area where we did get consensus was that now that we're doing CI
on gitlab we can change pull requests from maintainers from via-email
to gitlab merge requests. This would hopefully mean that instead of
the release-manager having to tell gitlab to do a merge and then
reporting back the results of any CI failures, the maintainer
could directly see the CI results and deal with fixing up failures
and resubmitting without involving the release manager. (This
may have the disbenefit that there isn't a single person any
more who looks at all the CI results and gets a sense of whether
particular test cases have pre-existing intermittent failures.)

There was less agreement on the main problem of reviewing code.
On the positive side:
 - everyone acknowledged that the email workflow was a barrier to new
   contributors
 - email is not awkward just for newcomers -- many regular
   developers have to deal with corporate mail systems, firewalls,
   etc, that make the email workflow more awkward than it was
   when Linux (and subsequently QEMU) first adopted it decades ago
 - a web UI means that unreviewed submissions are easier to track,
   rather than being simply ignored on the mailing list
But on the negative side:
 - gitlab is not set up for a "submaintainer tree" kind of workflow,
   so patches would go directly into the main tree and get no
   per-subsystem testing beyond whatever our CI can cover
 - gitlab doesn't handle adding Reviewed-by: and similar tags
 - email provides an automatic archive of the historical code
   review conversation; gitlab doesn't do this as well
 - it would increase the degree to which we might have a lock-in
   problem with gitlab (we could switch away, but it gets more painful)
 - it has the potential to be a bigger barrier to new contributors
   getting started with reviewing, compared to "just send an email"
 - it would probably burn the project's CI minutes more quickly
   as we would do runs per-submission, not just per-pullreq
 - might increase the awkwardness of the "some contributors/bug
   reporters/people interested in a patch are only notifiable
   by gitlab handle, and some only by email, and you can

Re: [PATCH 01/11] tpm_crb: refactor common code

2023-07-13 Thread Stefan Berger





On 7/12/23 23:51, Joelle van Dyne wrote:

In preparation for the SysBus variant, we move common code styled
after the TPM TIS devices.

To maintain compatibility, we do not rename the existing tpm-crb
device.

Signed-off-by: Joelle van Dyne 
---
  docs/specs/tpm.rst  |   1 +
  hw/tpm/tpm_crb.h|  76 +++
  hw/tpm/tpm_crb.c| 270 ++--
  hw/tpm/tpm_crb_common.c | 218 
  hw/tpm/meson.build  |   1 +
  hw/tpm/trace-events |   2 +-
  6 files changed, 333 insertions(+), 235 deletions(-)
  create mode 100644 hw/tpm/tpm_crb.h
  create mode 100644 hw/tpm/tpm_crb_common.c

diff --git a/docs/specs/tpm.rst b/docs/specs/tpm.rst
index efe124a148..2bc29c9804 100644
--- a/docs/specs/tpm.rst
+++ b/docs/specs/tpm.rst
@@ -45,6 +45,7 @@ operating system.

  QEMU files related to TPM CRB interface:
   - ``hw/tpm/tpm_crb.c``
+ - ``hw/tpm/tpm_crb_common.c``


If you could add the command line to use for Windows on AARCH64 to this 
document in 11/11
that would be helpful because what is there right now ony works for Linux iirc.

Regarding this patch here:

Reviewed-by: Stefan Berger 




  SPAPR interface
  ---
diff --git a/hw/tpm/tpm_crb.h b/hw/tpm/tpm_crb.h
new file mode 100644
index 00..da3a0cf256
--- /dev/null
+++ b/hw/tpm/tpm_crb.h
@@ -0,0 +1,76 @@
+/*
+ * tpm_crb.h - QEMU's TPM CRB interface emulator
+ *
+ * Copyright (c) 2018 Red Hat, Inc.
+ *
+ * Authors:
+ *   Marc-André Lureau 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ * tpm_crb is a device for TPM 2.0 Command Response Buffer (CRB) Interface
+ * as defined in TCG PC Client Platform TPM Profile (PTP) Specification
+ * Family “2.0” Level 00 Revision 01.03 v22
+ */
+#ifndef TPM_TPM_CRB_H
+#define TPM_TPM_CRB_H
+
+#include "exec/memory.h"
+#include "hw/acpi/tpm.h"
+#include "sysemu/tpm_backend.h"
+#include "tpm_ppi.h"
+
+#define CRB_CTRL_CMD_SIZE (TPM_CRB_ADDR_SIZE - A_CRB_DATA_BUFFER)
+
+typedef struct TPMCRBState {
+TPMBackend *tpmbe;
+TPMBackendCmd cmd;
+uint32_t regs[TPM_CRB_R_MAX];
+MemoryRegion mmio;
+MemoryRegion cmdmem;
+
+size_t be_buffer_size;
+
+bool ppi_enabled;
+TPMPPI ppi;
+} TPMCRBState;
+
+#define CRB_INTF_TYPE_CRB_ACTIVE 0b1
+#define CRB_INTF_VERSION_CRB 0b1
+#define CRB_INTF_CAP_LOCALITY_0_ONLY 0b0
+#define CRB_INTF_CAP_IDLE_FAST 0b0
+#define CRB_INTF_CAP_XFER_SIZE_64 0b11
+#define CRB_INTF_CAP_FIFO_NOT_SUPPORTED 0b0
+#define CRB_INTF_CAP_CRB_SUPPORTED 0b1
+#define CRB_INTF_IF_SELECTOR_CRB 0b1
+
+enum crb_loc_ctrl {
+CRB_LOC_CTRL_REQUEST_ACCESS = BIT(0),
+CRB_LOC_CTRL_RELINQUISH = BIT(1),
+CRB_LOC_CTRL_SEIZE = BIT(2),
+CRB_LOC_CTRL_RESET_ESTABLISHMENT_BIT = BIT(3),
+};
+
+enum crb_ctrl_req {
+CRB_CTRL_REQ_CMD_READY = BIT(0),
+CRB_CTRL_REQ_GO_IDLE = BIT(1),
+};
+
+enum crb_start {
+CRB_START_INVOKE = BIT(0),
+};
+
+enum crb_cancel {
+CRB_CANCEL_INVOKE = BIT(0),
+};
+
+#define TPM_CRB_NO_LOCALITY 0xff
+
+void tpm_crb_request_completed(TPMCRBState *s, int ret);
+enum TPMVersion tpm_crb_get_version(TPMCRBState *s);
+int tpm_crb_pre_save(TPMCRBState *s);
+void tpm_crb_reset(TPMCRBState *s, uint64_t baseaddr);
+void tpm_crb_init_memory(Object *obj, TPMCRBState *s, Error **errp);
+
+#endif /* TPM_TPM_CRB_H */
diff --git a/hw/tpm/tpm_crb.c b/hw/tpm/tpm_crb.c
index ea930da545..3ef4977fb5 100644
--- a/hw/tpm/tpm_crb.c
+++ b/hw/tpm/tpm_crb.c
@@ -31,257 +31,62 @@
  #include "tpm_ppi.h"
  #include "trace.h"
  #include "qom/object.h"
+#include "tpm_crb.h"

  struct CRBState {
  DeviceState parent_obj;

-TPMBackend *tpmbe;
-TPMBackendCmd cmd;
-uint32_t regs[TPM_CRB_R_MAX];
-MemoryRegion mmio;
-MemoryRegion cmdmem;
-
-size_t be_buffer_size;
-
-bool ppi_enabled;
-TPMPPI ppi;
+TPMCRBState state;
  };
  typedef struct CRBState CRBState;

  DECLARE_INSTANCE_CHECKER(CRBState, CRB,
   TYPE_TPM_CRB)

-#define CRB_INTF_TYPE_CRB_ACTIVE 0b1
-#define CRB_INTF_VERSION_CRB 0b1
-#define CRB_INTF_CAP_LOCALITY_0_ONLY 0b0
-#define CRB_INTF_CAP_IDLE_FAST 0b0
-#define CRB_INTF_CAP_XFER_SIZE_64 0b11
-#define CRB_INTF_CAP_FIFO_NOT_SUPPORTED 0b0
-#define CRB_INTF_CAP_CRB_SUPPORTED 0b1
-#define CRB_INTF_IF_SELECTOR_CRB 0b1
-
-#define CRB_CTRL_CMD_SIZE (TPM_CRB_ADDR_SIZE - A_CRB_DATA_BUFFER)
-
-enum crb_loc_ctrl {
-CRB_LOC_CTRL_REQUEST_ACCESS = BIT(0),
-CRB_LOC_CTRL_RELINQUISH = BIT(1),
-CRB_LOC_CTRL_SEIZE = BIT(2),
-CRB_LOC_CTRL_RESET_ESTABLISHMENT_BIT = BIT(3),
-};
-
-enum crb_ctrl_req {
-CRB_CTRL_REQ_CMD_READY = BIT(0),
-CRB_CTRL_REQ_GO_IDLE = BIT(1),
-};
-
-enum crb_start {
-CRB_START_INVOKE = BIT(0),
-};
-
-enum crb_cancel {
-CRB_CANCEL_INVOKE = BIT(0),
-};
-
-#define TPM_CRB_NO_LOCALITY 0xff
-
-static uint64_t tpm_crb_mmio_read(void *opaque, hwaddr addr,
-  unsigned size)
-{
-CR

Re: [PATCH] hw/pci: Warn when ARI/SR-IOV device has non-zero Function number

2023-07-13 Thread Akihiko Odaki


On 2023/07/12 21:06, Michael S. Tsirkin wrote:

On Wed, Jul 12, 2023 at 08:50:32PM +0900, Akihiko Odaki wrote:

On 2023/07/12 20:46, Michael S. Tsirkin wrote:

On Wed, Jul 12, 2023 at 08:27:32PM +0900, Akihiko Odaki wrote:

Current SR/IOV implementations assume that hardcoded Function numbers
are always available and will not conflict. It is somewhat non-trivial
to make the Function numbers to use controllable to avoid Function
number conflicts so ensure there is only one PF to make the assumption
hold true.
Also warn when non-SR/IOV multifunction was attempted with ARI enabled;
ARI has the next Function number field register, and currently it's
hardcoded to 0, which prevents non-SR/IOV multifunction. It is
certainly possible to add a logic to determine the correct next Function
number according to the configuration, but it's not worth since all
ARI-capable devices are also SR/IOV devices, which do not support
multiple PFs as stated above.

Signed-off-by: Akihiko Odaki 


I am not really interested in adding this stuff.
The real thing to focus on is fixing ARI emulation, not
warning users about ways in which it's broken.


What do you think about multiple SR/IOV PFs? Do you think it's worth/easy
enough to fix SR/IOV code to support it? Otherwise it's not worth fixing ARI
since currently only SR/IOV devices implement it.


There's nothing especially hard about it. You can in particular just
assume the user knows what he's doing and not worry too much about
checking. Creating invalid configs might also come handy e.g. for debug.
The important thing, and that's missing ATM, is giving management
ability to find out TotalVFs, VF offset and VF stride, so it can
avoid creating these conflicts.

For igd maybe we should make VF offset and VF stride just 1 unconditionally -
I have no idea why it was made 2 ATM - could you check what does
real hardware do?


The current igb implementation match with real hardware. It is defined 
in the datasheet*, section 9.6.4.6. I don't know why it's 2 either.


* 
https://www.intel.com/content/dam/www/public/us/en/documents/datasheets/82576eg-gbe-datasheet.pdf




Yes, warning at least is handy for
management debugging. It shouldn't be hard I think, but the
logic does tend to be O(n^2). Maybe add a flag to check,
and management developers can use that for debugging.




---
   hw/pci/pci.c | 59 +---
   1 file changed, 42 insertions(+), 17 deletions(-)

diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index 784c02a182..50359a0f3a 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -2124,23 +2124,48 @@ static void pci_qdev_realize(DeviceState *qdev, Error 
**errp)
   }
   }
-/*
- * A PCIe Downstream Port that do not have ARI Forwarding enabled must
- * associate only Device 0 with the device attached to the bus
- * representing the Link from the Port (PCIe base spec rev 4.0 ver 0.3,
- * sec 7.3.1).
- * With ARI, PCI_SLOT() can return non-zero value as the traditional
- * 5-bit Device Number and 3-bit Function Number fields in its associated
- * Routing IDs, Requester IDs and Completer IDs are interpreted as a
- * single 8-bit Function Number. Hence, ignore ARI capable devices.
- */
-if (pci_is_express(pci_dev) &&
-!pcie_find_capability(pci_dev, PCI_EXT_CAP_ID_ARI) &&
-pcie_has_upstream_port(pci_dev) &&
-PCI_SLOT(pci_dev->devfn)) {
-warn_report("PCI: slot %d is not valid for %s,"
-" parent device only allows plugging into slot 0.",
-PCI_SLOT(pci_dev->devfn), pci_dev->name);
+if (pci_is_express(pci_dev)) {
+/*
+ * A PCIe Downstream Port that do not have ARI Forwarding enabled must
+ * associate only Device 0 with the device attached to the bus
+ * representing the Link from the Port (PCIe base spec rev 4.0 ver 0.3,
+ * sec 7.3.1).
+ * With ARI, PCI_SLOT() can return non-zero value as the traditional
+ * 5-bit Device Number and 3-bit Function Number fields in its
+ * associated Routing IDs, Requester IDs and Completer IDs are
+ * interpreted as a single 8-bit Function Number. Hence, ignore ARI
+ * capable devices.
+ */
+if (!pcie_find_capability(pci_dev, PCI_EXT_CAP_ID_ARI) &&
+pcie_has_upstream_port(pci_dev) &&
+PCI_SLOT(pci_dev->devfn)) {
+warn_report("PCI: slot %d is not valid for %s,"
+" parent device only allows plugging into slot 0.",
+PCI_SLOT(pci_dev->devfn), pci_dev->name);
+}
+
+/*
+ * Current SR/IOV implementations assume that hardcoded Function 
numbers
+ * are always available. Ensure there is only one PF to make the
+ * assumption hold true.
+ */
+if (pcie_find_capability(pci_dev, PCI_EXT_CAP_ID_SRIOV) &&
+PCI_FUNC(pci_dev->devfn)) {
+warn_report("PCI

Re: [PATCH 04/11] tpm_crb: use a single read-as-mem/write-as-mmio mapping

2023-07-13 Thread Stefan Berger





On 7/12/23 23:51, Joelle van Dyne wrote:

On Apple Silicon, when Windows performs a LDP on the CRB MMIO space,
the exception is not decoded by hardware and we cannot trap the MMIO
read. This led to the idea from @agraf to use the same mapping type as
ROM devices: namely that reads should be seen as memory type and
writes should trap as MMIO.

Once that was done, the second memory mapping of the command buffer
region was redundent and was removed.

A note about the removal of the read trap for `CRB_LOC_STATE`:
The only usage was to return the most up-to-date value for
`tpmEstablished`. However, `tpmEstablished` is only set when a
TPM2_HashStart operation is called which only exists for locality 4.
Indeed, the comment for the write handler of `CRB_LOC_CTRL` makes the
same argument for why it is not calling the backend to reset the
`tpmEstablished` bit. As this bit is unused, we do not need to worry
about updating it for reads.

Signed-off-by: Joelle van Dyne 
---
  hw/tpm/tpm_crb.h|   2 -
  hw/tpm/tpm_crb.c|   3 -
  hw/tpm/tpm_crb_common.c | 124 
  3 files changed, 63 insertions(+), 66 deletions(-)

diff --git a/hw/tpm/tpm_crb.h b/hw/tpm/tpm_crb.h
index da3a0cf256..7cdd37335f 100644
--- a/hw/tpm/tpm_crb.h
+++ b/hw/tpm/tpm_crb.h
@@ -26,9 +26,7 @@
  typedef struct TPMCRBState {
  TPMBackend *tpmbe;
  TPMBackendCmd cmd;
-uint32_t regs[TPM_CRB_R_MAX];
  MemoryRegion mmio;
-MemoryRegion cmdmem;

  size_t be_buffer_size;

diff --git a/hw/tpm/tpm_crb.c b/hw/tpm/tpm_crb.c
index 598c3e0161..07c6868d8d 100644
--- a/hw/tpm/tpm_crb.c
+++ b/hw/tpm/tpm_crb.c
@@ -68,7 +68,6 @@ static const VMStateDescription vmstate_tpm_crb_none = {
  .name = "tpm-crb",
  .pre_save = tpm_crb_none_pre_save,
  .fields = (VMStateField[]) {
-VMSTATE_UINT32_ARRAY(state.regs, CRBState, TPM_CRB_R_MAX),


This has to stay here otherwise we cannot restart VMs from saved state once 
QEMU is upgraded.

2023-07-13T14:15:43.997718Z qemu-system-x86_64: Unknown ramblock "tpm-crb-cmd", 
cannot accept migration
2023-07-13T14:15:43.997813Z qemu-system-x86_64: error while loading state for 
instance 0x0 of device 'ram'
2023-07-13T14:15:43.997841Z qemu-system-x86_64: load of migration failed: 
Invalid argument


   Stefan

Re: [PATCH 04/11] tpm_crb: use a single read-as-mem/write-as-mmio mapping

2023-07-13 Thread Peter Maydell

On Thu, 13 Jul 2023 at 15:18, Stefan Berger  wrote:
>
>
>
> On 7/12/23 23:51, Joelle van Dyne wrote:
> > On Apple Silicon, when Windows performs a LDP on the CRB MMIO space,
> > the exception is not decoded by hardware and we cannot trap the MMIO
> > read. This led to the idea from @agraf to use the same mapping type as
> > ROM devices: namely that reads should be seen as memory type and
> > writes should trap as MMIO.

These are hardware registers, right? Windows shouldn't
really be doing LDP to those if it expects to be able to run
in a VM...

> > Once that was done, the second memory mapping of the command buffer
> > region was redundent and was removed.
> >
> > A note about the removal of the read trap for `CRB_LOC_STATE`:
> > The only usage was to return the most up-to-date value for
> > `tpmEstablished`. However, `tpmEstablished` is only set when a
> > TPM2_HashStart operation is called which only exists for locality 4.
> > Indeed, the comment for the write handler of `CRB_LOC_CTRL` makes the
> > same argument for why it is not calling the backend to reset the
> > `tpmEstablished` bit. As this bit is unused, we do not need to worry
> > about updating it for reads.
> >
> > Signed-off-by: Joelle van Dyne 
> > ---
> >   hw/tpm/tpm_crb.h|   2 -
> >   hw/tpm/tpm_crb.c|   3 -
> >   hw/tpm/tpm_crb_common.c | 124 
> >   3 files changed, 63 insertions(+), 66 deletions(-)
> >
> > diff --git a/hw/tpm/tpm_crb.h b/hw/tpm/tpm_crb.h
> > index da3a0cf256..7cdd37335f 100644
> > --- a/hw/tpm/tpm_crb.h
> > +++ b/hw/tpm/tpm_crb.h
> > @@ -26,9 +26,7 @@
> >   typedef struct TPMCRBState {
> >   TPMBackend *tpmbe;
> >   TPMBackendCmd cmd;
> > -uint32_t regs[TPM_CRB_R_MAX];
> >   MemoryRegion mmio;
> > -MemoryRegion cmdmem;
> >
> >   size_t be_buffer_size;
> >
> > diff --git a/hw/tpm/tpm_crb.c b/hw/tpm/tpm_crb.c
> > index 598c3e0161..07c6868d8d 100644
> > --- a/hw/tpm/tpm_crb.c
> > +++ b/hw/tpm/tpm_crb.c
> > @@ -68,7 +68,6 @@ static const VMStateDescription vmstate_tpm_crb_none = {
> >   .name = "tpm-crb",
> >   .pre_save = tpm_crb_none_pre_save,
> >   .fields = (VMStateField[]) {
> > -VMSTATE_UINT32_ARRAY(state.regs, CRBState, TPM_CRB_R_MAX),
>
> This has to stay here otherwise we cannot restart VMs from saved state once 
> QEMU is upgraded.
>
> 2023-07-13T14:15:43.997718Z qemu-system-x86_64: Unknown ramblock 
> "tpm-crb-cmd", cannot accept migration
> 2023-07-13T14:15:43.997813Z qemu-system-x86_64: error while loading state for 
> instance 0x0 of device 'ram'
> 2023-07-13T14:15:43.997841Z qemu-system-x86_64: load of migration failed: 
> Invalid argument

More generally, for migration compatibility in the other
direction you need to use memory_region_init_rom_device_nomigrate()
and make sure you keep migrating the data via this, not
via the MemoryRegion.

I'm not a super-fan of hacking around the fact that LDP
to hardware registers isn't supported in specific device
models, though...

thanks
-- PMM

Re: [PATCH 04/11] tpm_crb: use a single read-as-mem/write-as-mmio mapping

2023-07-13 Thread Stefan Berger





On 7/13/23 10:50, Peter Maydell wrote:

On Thu, 13 Jul 2023 at 15:18, Stefan Berger  wrote:




On 7/12/23 23:51, Joelle van Dyne wrote:

On Apple Silicon, when Windows performs a LDP on the CRB MMIO space,
the exception is not decoded by hardware and we cannot trap the MMIO
read. This led to the idea from @agraf to use the same mapping type as
ROM devices: namely that reads should be seen as memory type and
writes should trap as MMIO.



+++ b/hw/tpm/tpm_crb.c
@@ -68,7 +68,6 @@ static const VMStateDescription vmstate_tpm_crb_none = {
   .name = "tpm-crb",
   .pre_save = tpm_crb_none_pre_save,
   .fields = (VMStateField[]) {
-VMSTATE_UINT32_ARRAY(state.regs, CRBState, TPM_CRB_R_MAX),


This has to stay here otherwise we cannot restart VMs from saved state once 
QEMU is upgraded.

2023-07-13T14:15:43.997718Z qemu-system-x86_64: Unknown ramblock "tpm-crb-cmd", 
cannot accept migration
2023-07-13T14:15:43.997813Z qemu-system-x86_64: error while loading state for 
instance 0x0 of device 'ram'
2023-07-13T14:15:43.997841Z qemu-system-x86_64: load of migration failed: 
Invalid argument


More generally, for migration compatibility in the other
direction you need to use memory_region_init_rom_device_nomigrate()
and make sure you keep migrating the data via this, not
via the MemoryRegion.

I'm not a super-fan of hacking around the fact that LDP
to hardware registers isn't supported in specific device
models, though...


What does this mean for this effort here?

   Stefan



thanks
-- PMM

Re: [PATCH 07/11] hw/arm/virt: add plug handler for TPM on SysBus

2023-07-13 Thread Peter Maydell

On Thu, 13 Jul 2023 at 04:52, Joelle van Dyne  wrote:
>
> TPM needs to know its own base address in order to generate its DSDT
> device entry.
>
> Signed-off-by: Joelle van Dyne 
> ---
>  hw/arm/virt.c | 37 +
>  1 file changed, 37 insertions(+)
>
> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index 7d9dbc2663..432148ef47 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -2732,6 +2732,37 @@ static void virt_memory_plug(HotplugHandler 
> *hotplug_dev,
>   dev, &error_abort);
>  }
>
> +#ifdef CONFIG_TPM
> +static void virt_tpm_plug(VirtMachineState *vms, TPMIf *tpmif)
> +{
> +PlatformBusDevice *pbus = PLATFORM_BUS_DEVICE(vms->platform_bus_dev);
> +hwaddr pbus_base = vms->memmap[VIRT_PLATFORM_BUS].base;
> +SysBusDevice *sbdev = SYS_BUS_DEVICE(tpmif);
> +MemoryRegion *sbdev_mr;
> +hwaddr tpm_base;
> +uint64_t tpm_size;
> +
> +if (!sbdev || !object_dynamic_cast(OBJECT(sbdev), TYPE_SYS_BUS_DEVICE)) {
> +return;
> +}
> +
> +tpm_base = platform_bus_get_mmio_addr(pbus, sbdev, 0);
> +assert(tpm_base != -1);
> +
> +tpm_base += pbus_base;
> +
> +sbdev_mr = sysbus_mmio_get_region(sbdev, 0);
> +tpm_size = memory_region_size(sbdev_mr);
> +
> +if (object_property_find(OBJECT(sbdev), "baseaddr")) {
> +object_property_set_uint(OBJECT(sbdev), "baseaddr", tpm_base, NULL);
> +}
> +if (object_property_find(OBJECT(sbdev), "size")) {
> +object_property_set_uint(OBJECT(sbdev), "size", tpm_size, NULL);
> +}
> +}
> +#endif

I do not like the "platform bus" at all -- it is a nasty hack.
If the virt board needs a memory mapped TPM device it should probably
just create one, the same way we create our other memory mapped
devices. But...

How are TPM devices typically set up/visible to the guest on
real Arm server hardware ? Should this be a sysbus device at all?

thanks
-- PMM

Re: [PATCH 04/11] tpm_crb: use a single read-as-mem/write-as-mmio mapping

2023-07-13 Thread Peter Maydell

On Thu, 13 Jul 2023 at 16:28, Stefan Berger  wrote:
>
>
>
> On 7/13/23 10:50, Peter Maydell wrote:
> > On Thu, 13 Jul 2023 at 15:18, Stefan Berger  wrote:
> >>
> >>
> >>
> >> On 7/12/23 23:51, Joelle van Dyne wrote:
> >>> On Apple Silicon, when Windows performs a LDP on the CRB MMIO space,
> >>> the exception is not decoded by hardware and we cannot trap the MMIO
> >>> read. This led to the idea from @agraf to use the same mapping type as
> >>> ROM devices: namely that reads should be seen as memory type and
> >>> writes should trap as MMIO.
>
> >>> +++ b/hw/tpm/tpm_crb.c
> >>> @@ -68,7 +68,6 @@ static const VMStateDescription vmstate_tpm_crb_none = {
> >>>.name = "tpm-crb",
> >>>.pre_save = tpm_crb_none_pre_save,
> >>>.fields = (VMStateField[]) {
> >>> -VMSTATE_UINT32_ARRAY(state.regs, CRBState, TPM_CRB_R_MAX),
> >>
> >> This has to stay here otherwise we cannot restart VMs from saved state 
> >> once QEMU is upgraded.
> >>
> >> 2023-07-13T14:15:43.997718Z qemu-system-x86_64: Unknown ramblock 
> >> "tpm-crb-cmd", cannot accept migration
> >> 2023-07-13T14:15:43.997813Z qemu-system-x86_64: error while loading state 
> >> for instance 0x0 of device 'ram'
> >> 2023-07-13T14:15:43.997841Z qemu-system-x86_64: load of migration failed: 
> >> Invalid argument
> >
> > More generally, for migration compatibility in the other
> > direction you need to use memory_region_init_rom_device_nomigrate()
> > and make sure you keep migrating the data via this, not
> > via the MemoryRegion.
> >
> > I'm not a super-fan of hacking around the fact that LDP
> > to hardware registers isn't supported in specific device
> > models, though...
>
> What does this mean for this effort here?

Usually we say "fix the guest to not try to access hardware
registers with silly load/store instruction types". The other
option would be "put in a large amount of effort to support
emulating those instructions in QEMU userspace when KVM/HVF/etc
trap and punt them to us". For the last decade or so we have
taken the first of these approaches :-)

thanks
-- PMM

Re: [PATCH 02/11] tpm_crb: CTRL_RSP_ADDR is 64-bits wide

2023-07-13 Thread Stefan Berger





On 7/12/23 23:51, Joelle van Dyne wrote:

The register is actually 64-bits but in order to make this more clear
than the specification, we define two 32-bit registers:
CTRL_RSP_LADDR and CTRL_RSP_HADDR to match the CTRL_CMD_* naming. This
deviates from the specs but is way more clear.

Previously, the only CRB device uses a fixed system address so this
was not an issue. However, once we support SysBus CRB device, the
address can be anywhere in 64-bit space.

Signed-off-by: Joelle van Dyne 


Reviewed-by: Stefan Berger 


---
  include/hw/acpi/tpm.h  | 3 ++-
  hw/tpm/tpm_crb_common.c| 3 ++-
  tests/qtest/tpm-crb-test.c | 2 +-
  tests/qtest/tpm-util.c | 2 +-
  4 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/include/hw/acpi/tpm.h b/include/hw/acpi/tpm.h
index 579c45f5ba..f60bfe2789 100644
--- a/include/hw/acpi/tpm.h
+++ b/include/hw/acpi/tpm.h
@@ -174,7 +174,8 @@ REG32(CRB_CTRL_CMD_SIZE, 0x58)
  REG32(CRB_CTRL_CMD_LADDR, 0x5C)
  REG32(CRB_CTRL_CMD_HADDR, 0x60)
  REG32(CRB_CTRL_RSP_SIZE, 0x64)
-REG32(CRB_CTRL_RSP_ADDR, 0x68)
+REG32(CRB_CTRL_RSP_LADDR, 0x68)
+REG32(CRB_CTRL_RSP_HADDR, 0x6C)
  REG32(CRB_DATA_BUFFER, 0x80)

  #define TPM_CRB_ADDR_BASE   0xFED4
diff --git a/hw/tpm/tpm_crb_common.c b/hw/tpm/tpm_crb_common.c
index 4c173affb6..228e2d0faf 100644
--- a/hw/tpm/tpm_crb_common.c
+++ b/hw/tpm/tpm_crb_common.c
@@ -199,7 +199,8 @@ void tpm_crb_reset(TPMCRBState *s, uint64_t baseaddr)
  s->regs[R_CRB_CTRL_CMD_LADDR] = (uint32_t)baseaddr;
  s->regs[R_CRB_CTRL_CMD_HADDR] = (uint32_t)(baseaddr >> 32);
  s->regs[R_CRB_CTRL_RSP_SIZE] = CRB_CTRL_CMD_SIZE;
-s->regs[R_CRB_CTRL_RSP_ADDR] = (uint32_t)baseaddr;
+s->regs[R_CRB_CTRL_RSP_LADDR] = (uint32_t)baseaddr;
+s->regs[R_CRB_CTRL_RSP_HADDR] = (uint32_t)(baseaddr >> 32);

  s->be_buffer_size = MIN(tpm_backend_get_buffer_size(s->tpmbe),
  CRB_CTRL_CMD_SIZE);
diff --git a/tests/qtest/tpm-crb-test.c b/tests/qtest/tpm-crb-test.c
index 396ae3f91c..9d30fe8293 100644
--- a/tests/qtest/tpm-crb-test.c
+++ b/tests/qtest/tpm-crb-test.c
@@ -28,7 +28,7 @@ static void tpm_crb_test(const void *data)
  uint32_t csize = readl(TPM_CRB_ADDR_BASE + A_CRB_CTRL_CMD_SIZE);
  uint64_t caddr = readq(TPM_CRB_ADDR_BASE + A_CRB_CTRL_CMD_LADDR);
  uint32_t rsize = readl(TPM_CRB_ADDR_BASE + A_CRB_CTRL_RSP_SIZE);
-uint64_t raddr = readq(TPM_CRB_ADDR_BASE + A_CRB_CTRL_RSP_ADDR);
+uint64_t raddr = readq(TPM_CRB_ADDR_BASE + A_CRB_CTRL_RSP_LADDR);
  uint8_t locstate = readb(TPM_CRB_ADDR_BASE + A_CRB_LOC_STATE);
  uint32_t locctrl = readl(TPM_CRB_ADDR_BASE + A_CRB_LOC_CTRL);
  uint32_t locsts = readl(TPM_CRB_ADDR_BASE + A_CRB_LOC_STS);
diff --git a/tests/qtest/tpm-util.c b/tests/qtest/tpm-util.c
index 1c0319e6e7..dd02057fc0 100644
--- a/tests/qtest/tpm-util.c
+++ b/tests/qtest/tpm-util.c
@@ -25,7 +25,7 @@ void tpm_util_crb_transfer(QTestState *s,
 unsigned char *rsp, size_t rsp_size)
  {
  uint64_t caddr = qtest_readq(s, TPM_CRB_ADDR_BASE + A_CRB_CTRL_CMD_LADDR);
-uint64_t raddr = qtest_readq(s, TPM_CRB_ADDR_BASE + A_CRB_CTRL_RSP_ADDR);
+uint64_t raddr = qtest_readq(s, TPM_CRB_ADDR_BASE + A_CRB_CTRL_RSP_LADDR);

  qtest_writeb(s, TPM_CRB_ADDR_BASE + A_CRB_LOC_CTRL, 1);

Re: [PATCH] util/interval-tree: Avoid race conditions without optimization

2023-07-13 Thread Richard Henderson


On 7/13/23 12:32, Peter Maydell wrote:

On Fri, 7 Jul 2023 at 11:30, Richard Henderson
 wrote:


Read the left and right trees once, so that the gating
tests are meaningful.  This was only a problem at -O0,
where the compiler didn't CSE the two reads.

Cc: qemu-sta...@nongnu.org
Signed-off-by: Richard Henderson 


Reviewed-by: Peter Maydell 

If this data structure is intended to support operations
being done on it while it's being mutated, shouldn't it
be using the atomic accessors, though? That would make
it clearer that you can't just undo the transformation
made by this patch.


Yes, it probably should.  I use qatomic_set() where the kernel used WRITE_ONCE, but there 
was no markup for the read side.



r~

Re: [PATCH 04/11] tpm_crb: use a single read-as-mem/write-as-mmio mapping

2023-07-13 Thread Stefan Berger





On 7/13/23 11:34, Peter Maydell wrote:

On Thu, 13 Jul 2023 at 16:28, Stefan Berger  wrote:




On 7/13/23 10:50, Peter Maydell wrote:

On Thu, 13 Jul 2023 at 15:18, Stefan Berger  wrote:




On 7/12/23 23:51, Joelle van Dyne wrote:

On Apple Silicon, when Windows performs a LDP on the CRB MMIO space,
the exception is not decoded by hardware and we cannot trap the MMIO
read. This led to the idea from @agraf to use the same mapping type as
ROM devices: namely that reads should be seen as memory type and
writes should trap as MMIO.



+++ b/hw/tpm/tpm_crb.c
@@ -68,7 +68,6 @@ static const VMStateDescription vmstate_tpm_crb_none = {
.name = "tpm-crb",
.pre_save = tpm_crb_none_pre_save,
.fields = (VMStateField[]) {
-VMSTATE_UINT32_ARRAY(state.regs, CRBState, TPM_CRB_R_MAX),


This has to stay here otherwise we cannot restart VMs from saved state once 
QEMU is upgraded.

2023-07-13T14:15:43.997718Z qemu-system-x86_64: Unknown ramblock "tpm-crb-cmd", 
cannot accept migration
2023-07-13T14:15:43.997813Z qemu-system-x86_64: error while loading state for 
instance 0x0 of device 'ram'
2023-07-13T14:15:43.997841Z qemu-system-x86_64: load of migration failed: 
Invalid argument


More generally, for migration compatibility in the other
direction you need to use memory_region_init_rom_device_nomigrate()
and make sure you keep migrating the data via this, not
via the MemoryRegion.

I'm not a super-fan of hacking around the fact that LDP
to hardware registers isn't supported in specific device
models, though...


What does this mean for this effort here?


Usually we say "fix the guest to not try to access hardware
registers with silly load/store instruction types". The other
option would be "put in a large amount of effort to support
emulating those instructions in QEMU userspace when KVM/HVF/etc
trap and punt them to us". For the last decade or so we have
taken the first of these approaches :-)


Is Microsoft likely to react to use telling them "fix the guest"?

   Stefan



thanks
-- PMM

Re: [PATCH V9 00/46] Live Update

2023-07-13 Thread Michael Galaxy


 Good morning,

On 7/10/23 10:10, Steven Sistare wrote:

On 6/12/2023 10:59 AM, Michael Galaxy wrote:

Hi Steve,
On 6/7/23 12:37, Steven Sistare wrote:

On 6/7/2023 11:55 AM, Michael Galaxy wrote:

Another option could be to expose "-migrate-mode-disable" (instead of enable) 
and just enable all 3 modes by default,
since we are already required to switch from "normal" mode to a CPR-specific 
mode when it is time to do a live update,
if the intention is to preserve the capability to completely prevent a running 
QEMU from using these modes
before the VM starts up.

- Michael

On 6/6/23 17:15, Michael Galaxy wrote:

Hi Steve,

In the current design you have, we have to specify both the command line parameter 
"-migrate-mode-enable cpr-reboot"
*and* issue the monitor command "migrate_set_parameter mode cpr-${mode}".

Is it possible to opt-in to the CPR mode just once over the monitor instead of 
having to specify it twice on the command line?
This would also match the live migration model: You do not need to necessarily "opt 
in" to live migration mode through
a command line parameter, you simply request it when you need to. Can CPR 
behave the same way?

This would also make switching over to a CPR-capable version of QEMU much 
simpler and would even make it work for
existing libvirt-managed guests as their command line parameters would no 
longer need to change. This would allow us to
simply power-off and power-on existing VMs to make them CPR-capable and then 
work on a libvirt patch later when
we're ready to do so.


Comments?

Hi Michael,
    Requiring -migrate-enable-mode allows qemu to initialize objects
differently, if necessary, so that migration for a mode is not blocked.
See callers of migrate_mode_enabled.  There is only one so far, in
ram_block_add.  If the mode is cpr-exec, then it creates anonymous ram
blocks using memfd_create, else using MAP_ANON.  In the V7 series, this
was controlled by a '-machine memfd-alloc=on' option.

migrate-enable-mode is more future proof for the user.  If something new must
initialize differently to support cpr, then it adds a call to 
migrate_mode_enabled,
and the command line remains the same.  However, I could be persuaded to go 
either way.

OK, so it is cpr-exec that needs this option (because of ram block allocation), 
not really cpr-reboot.
Could the option then be made to only be required for cpr-exec and not 
cpr-reboot, then,
since cpr-reboot doesn't require that consideration?

In a different forum Juan said this is a memory issue, so it should be expressed
as a memory related option.  So, I will delete -migrate-enable-mode and revert 
back
to -machine memfd-alloc, as defined in the V7 patch series.


Acknowledged. I'm going to try to get my reviewed-by's in soon. Sorry I 
haven't done it sooner.


We've finished testing these patches on our systems and are moving forward.



A secondary reason for -migrate-enable-mode is to support the only-cpr-capable
option.  It needs to know which mode will be used, in order to check a
mode-specific blocker list.

Still, only-cpr-capable is also optional. If and only if one needs this option, 
the mode could be
specified as part of the option itself, rather than requiring an extra command 
line parameter, no?

Yes, I will make that change.

- Steve


Acknowledged.

Re: [PATCH 04/11] tpm_crb: use a single read-as-mem/write-as-mmio mapping

2023-07-13 Thread Peter Maydell

On Thu, 13 Jul 2023 at 16:46, Stefan Berger  wrote:
> On 7/13/23 11:34, Peter Maydell wrote:
> > On Thu, 13 Jul 2023 at 16:28, Stefan Berger  wrote:
> >> On 7/13/23 10:50, Peter Maydell wrote:
> >>> I'm not a super-fan of hacking around the fact that LDP
> >>> to hardware registers isn't supported in specific device
> >>> models, though...
> >>
> >> What does this mean for this effort here?
> >
> > Usually we say "fix the guest to not try to access hardware
> > registers with silly load/store instruction types". The other
> > option would be "put in a large amount of effort to support
> > emulating those instructions in QEMU userspace when KVM/HVF/etc
> > trap and punt them to us". For the last decade or so we have
> > taken the first of these approaches :-)
>
> Is Microsoft likely to react to use telling them "fix the guest"?

They have on occasion in the past, yes.

The other outstanding question here is if this TPM device
should be a sysbus one at all (i.e. not i2c), which might
render this part moot.

thanks
-- PMM

Re: [PATCH 03/11] tpm_ppi: refactor memory space initialization

2023-07-13 Thread Stefan Berger





On 7/12/23 23:51, Joelle van Dyne wrote:

Instead of calling `memory_region_add_subregion` directly, we defer to
the caller to do it. This allows us to re-use the code for a SysBus
device.

Signed-off-by: Joelle van Dyne 


Reviewed-by: Stefan Berger 


---
  hw/tpm/tpm_ppi.h| 10 +++---
  hw/tpm/tpm_crb.c|  4 ++--
  hw/tpm/tpm_crb_common.c |  3 +++
  hw/tpm/tpm_ppi.c|  5 +
  hw/tpm/tpm_tis_isa.c|  5 +++--
  5 files changed, 12 insertions(+), 15 deletions(-)

diff --git a/hw/tpm/tpm_ppi.h b/hw/tpm/tpm_ppi.h
index bf5d4a300f..30863c6438 100644
--- a/hw/tpm/tpm_ppi.h
+++ b/hw/tpm/tpm_ppi.h
@@ -20,17 +20,13 @@ typedef struct TPMPPI {
  } TPMPPI;

  /**
- * tpm_ppi_init:
+ * tpm_ppi_init_memory:
   * @tpmppi: a TPMPPI
- * @m: the address-space / MemoryRegion to use
- * @addr: the address of the PPI region
   * @obj: the owner object
   *
- * Register the TPM PPI memory region at @addr on the given address
- * space for the object @obj.
+ * Creates the TPM PPI memory region.
   **/
-void tpm_ppi_init(TPMPPI *tpmppi, MemoryRegion *m,
-  hwaddr addr, Object *obj);
+void tpm_ppi_init_memory(TPMPPI *tpmppi, Object *obj);

  /**
   * tpm_ppi_reset:
diff --git a/hw/tpm/tpm_crb.c b/hw/tpm/tpm_crb.c
index 3ef4977fb5..598c3e0161 100644
--- a/hw/tpm/tpm_crb.c
+++ b/hw/tpm/tpm_crb.c
@@ -107,8 +107,8 @@ static void tpm_crb_none_realize(DeviceState *dev, Error 
**errp)
  TPM_CRB_ADDR_BASE + sizeof(s->state.regs), &s->state.cmdmem);

  if (s->state.ppi_enabled) {
-tpm_ppi_init(&s->state.ppi, get_system_memory(),
- TPM_PPI_ADDR_BASE, OBJECT(s));
+memory_region_add_subregion(get_system_memory(),
+TPM_PPI_ADDR_BASE, &s->state.ppi.ram);
  }

  if (xen_enabled()) {
diff --git a/hw/tpm/tpm_crb_common.c b/hw/tpm/tpm_crb_common.c
index 228e2d0faf..e56e910670 100644
--- a/hw/tpm/tpm_crb_common.c
+++ b/hw/tpm/tpm_crb_common.c
@@ -216,4 +216,7 @@ void tpm_crb_init_memory(Object *obj, TPMCRBState *s, Error 
**errp)
  "tpm-crb-mmio", sizeof(s->regs));
  memory_region_init_ram(&s->cmdmem, obj,
  "tpm-crb-cmd", CRB_CTRL_CMD_SIZE, errp);
+if (s->ppi_enabled) {
+tpm_ppi_init_memory(&s->ppi, obj);
+}
  }
diff --git a/hw/tpm/tpm_ppi.c b/hw/tpm/tpm_ppi.c
index 7f74e26ec6..40cab59afa 100644
--- a/hw/tpm/tpm_ppi.c
+++ b/hw/tpm/tpm_ppi.c
@@ -44,14 +44,11 @@ void tpm_ppi_reset(TPMPPI *tpmppi)
  }
  }

-void tpm_ppi_init(TPMPPI *tpmppi, MemoryRegion *m,
-  hwaddr addr, Object *obj)
+void tpm_ppi_init_memory(TPMPPI *tpmppi, Object *obj)
  {
  tpmppi->buf = qemu_memalign(qemu_real_host_page_size(),
  HOST_PAGE_ALIGN(TPM_PPI_ADDR_SIZE));
  memory_region_init_ram_device_ptr(&tpmppi->ram, obj, "tpm-ppi",
TPM_PPI_ADDR_SIZE, tpmppi->buf);
  vmstate_register_ram(&tpmppi->ram, DEVICE(obj));
-
-memory_region_add_subregion(m, addr, &tpmppi->ram);
  }
diff --git a/hw/tpm/tpm_tis_isa.c b/hw/tpm/tpm_tis_isa.c
index 91e3792248..7cd7415f30 100644
--- a/hw/tpm/tpm_tis_isa.c
+++ b/hw/tpm/tpm_tis_isa.c
@@ -134,8 +134,9 @@ static void tpm_tis_isa_realizefn(DeviceState *dev, Error 
**errp)
  TPM_TIS_ADDR_BASE, &s->mmio);

  if (s->ppi_enabled) {
-tpm_ppi_init(&s->ppi, isa_address_space(ISA_DEVICE(dev)),
- TPM_PPI_ADDR_BASE, OBJECT(dev));
+tpm_ppi_init_memory(&s->ppi, OBJECT(dev));
+memory_region_add_subregion(isa_address_space(ISA_DEVICE(dev)),
+TPM_PPI_ADDR_BASE, &s->ppi.ram);
  }
  }

Re: [PATCH 06/11] tpm_crb: move ACPI table building to device interface

2023-07-13 Thread Stefan Berger





On 7/12/23 23:51, Joelle van Dyne wrote:

This logic is similar to TPM TIS ISA device.

Signed-off-by: Joelle van Dyne 
---
  hw/i386/acpi-build.c | 23 ---
  hw/tpm/tpm_crb.c | 28 
  2 files changed, 28 insertions(+), 23 deletions(-)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 9c74fa17ad..b767df39df 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -1441,9 +1441,6 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
  uint32_t nr_mem = machine->ram_slots;
  int root_bus_limit = 0xFF;
  PCIBus *bus = NULL;
-#ifdef CONFIG_TPM
-TPMIf *tpm = tpm_find();
-#endif
  bool cxl_present = false;
  int i;
  VMBusBridge *vmbus_bridge = vmbus_bridge_find();
@@ -1793,26 +1790,6 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
  }
  }

-#ifdef CONFIG_TPM
-if (TPM_IS_CRB(tpm)) {
-dev = aml_device("TPM");
-aml_append(dev, aml_name_decl("_HID", aml_string("MSFT0101")));
-aml_append(dev, aml_name_decl("_STR",
-  aml_string("TPM 2.0 Device")));
-crs = aml_resource_template();
-aml_append(crs, aml_memory32_fixed(TPM_CRB_ADDR_BASE,
-   TPM_CRB_ADDR_SIZE, AML_READ_WRITE));
-aml_append(dev, aml_name_decl("_CRS", crs));
-
-aml_append(dev, aml_name_decl("_STA", aml_int(0xf)));
-aml_append(dev, aml_name_decl("_UID", aml_int(1)));
-
-tpm_build_ppi_acpi(tpm, dev);
-
-aml_append(sb_scope, dev);
-}
-#endif
-
  if (pcms->sgx_epc.size != 0) {
  uint64_t epc_base = pcms->sgx_epc.base;
  uint64_t epc_size = pcms->sgx_epc.size;
diff --git a/hw/tpm/tpm_crb.c b/hw/tpm/tpm_crb.c
index 6144081d30..14feb9857f 100644
--- a/hw/tpm/tpm_crb.c
+++ b/hw/tpm/tpm_crb.c
@@ -19,6 +19,8 @@
  #include "qemu/module.h"
  #include "qapi/error.h"
  #include "exec/address-spaces.h"
+#include "hw/acpi/acpi_aml_interface.h"
+#include "hw/acpi/tpm.h"
  #include "hw/qdev-properties.h"
  #include "hw/pci/pci_ids.h"
  #include "hw/acpi/tpm.h"
@@ -116,10 +118,34 @@ static void tpm_crb_isa_realize(DeviceState *dev, Error 
**errp)
  }
  }

+static void build_tpm_crb_isa_aml(AcpiDevAmlIf *adev, Aml *scope)
+{
+Aml *dev, *crs;
+CRBState *s = CRB(adev);
+TPMIf *ti = TPM_IF(s);
+
+dev = aml_device("TPM");
+if (tpm_crb_isa_get_version(ti) == TPM_VERSION_2_0) {
+aml_append(dev, aml_name_decl("_HID", aml_string("MSFT0101")));
+aml_append(dev, aml_name_decl("_STR", aml_string("TPM 2.0 Device")));
+} else {
+aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0C31")));
+}


CRB only exists for TPM 2.0 and that's why we didn't have a different case here 
before.

CRB only has MSFT0101: 
https://elixir.bootlin.com/linux/latest/source/drivers/char/tpm/tpm_crb.c#L820
TIS has PNP0C31: 
https://elixir.bootlin.com/linux/latest/source/drivers/char/tpm/tpm_tis.c

You should remove the check for TPM_VERSION_2_0.

   Stefan

+aml_append(dev, aml_name_decl("_UID", aml_int(1)));
+aml_append(dev, aml_name_decl("_STA", aml_int(0xF)));
+crs = aml_resource_template();
+aml_append(crs, aml_memory32_fixed(TPM_CRB_ADDR_BASE, TPM_CRB_ADDR_SIZE,
+  AML_READ_WRITE));
+aml_append(dev, aml_name_decl("_CRS", crs));
+tpm_build_ppi_acpi(ti, dev);
+aml_append(scope, dev);
+}
+
  static void tpm_crb_isa_class_init(ObjectClass *klass, void *data)
  {
  DeviceClass *dc = DEVICE_CLASS(klass);
  TPMIfClass *tc = TPM_IF_CLASS(klass);
+AcpiDevAmlIfClass *adevc = ACPI_DEV_AML_IF_CLASS(klass);

  dc->realize = tpm_crb_isa_realize;
  device_class_set_props(dc, tpm_crb_isa_properties);
@@ -128,6 +154,7 @@ static void tpm_crb_isa_class_init(ObjectClass *klass, void 
*data)
  tc->model = TPM_MODEL_TPM_CRB;
  tc->get_version = tpm_crb_isa_get_version;
  tc->request_completed = tpm_crb_isa_request_completed;
+adevc->build_dev_aml = build_tpm_crb_isa_aml;

  set_bit(DEVICE_CATEGORY_MISC, dc->categories);
  }
@@ -139,6 +166,7 @@ static const TypeInfo tpm_crb_isa_info = {
  .class_init  = tpm_crb_isa_class_init,
  .interfaces = (InterfaceInfo[]) {
  { TYPE_TPM_IF },
+{ TYPE_ACPI_DEV_AML_IF },
  { }
  }
  };

Re: [PATCH] target/hexagon/idef-parser: Remove self-assignment

2023-07-13 Thread Philippe Mathieu-Daudé


On 13/7/23 14:08, Anton Johansson via wrote:

The self assignment is clearly useless, and @1.last_column does not have
to be set for an expression with only a single token, so remove it.

Reported-by: Peter Maydell 
Signed-off-by: Anton Johansson 
---
  target/hexagon/idef-parser/idef-parser.y | 1 -
  1 file changed, 1 deletion(-)


Reviewed-by: Philippe Mathieu-Daudé

RE: [PATCH] target/hexagon/idef-parser: Remove self-assignment

2023-07-13 Thread Brian Cain




> -Original Message-
> From: Anton Johansson 
> Sent: Thursday, July 13, 2023 7:09 AM
> To: qemu-devel@nongnu.org
> Cc: Brian Cain ; peter.mayd...@linaro.org
> Subject: [PATCH] target/hexagon/idef-parser: Remove self-assignment
> 
> WARNING: This email originated from outside of Qualcomm. Please be wary of
> any links or attachments, and do not enable macros.
> 
> The self assignment is clearly useless, and @1.last_column does not have
> to be set for an expression with only a single token, so remove it.
> 
> Reported-by: Peter Maydell 
> Signed-off-by: Anton Johansson 
> ---
>  target/hexagon/idef-parser/idef-parser.y | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/target/hexagon/idef-parser/idef-parser.y b/target/hexagon/idef-
> parser/idef-parser.y
> index cd2612eb8c..a6587f5bcc 100644
> --- a/target/hexagon/idef-parser/idef-parser.y
> +++ b/target/hexagon/idef-parser/idef-parser.y
> @@ -802,7 +802,6 @@ rvalue : FAIL
> 
>  lvalue : FAIL
>   {
> - @1.last_column = @1.last_column;
>   yyassert(c, &@1, false, "Encountered a FAIL token as 
> lvalue.\n");
>   }
> | REG
> --
> 2.41.0

Reviewed-by: Brian Cain

Re: [PATCH 0/3] hw/arm/virt: Use generic CPU invalidation

2023-07-13 Thread Philippe Mathieu-Daudé


On 13/7/23 14:34, Gavin Shan wrote:

Hi Peter and Marcin,

On 7/13/23 21:52, Marcin Juszkiewicz wrote:

W dniu 13.07.2023 o 13:44, Peter Maydell pisze:


I see this isn't a change in this patch, but given that
what the user specifies is not "cortex-a8-arm-cpu" but
"cortex-a8", why do we include the "-arm-cpu" suffix in
the error messages? It's not valid syntax to say
"-cpu cortex-a8-arm-cpu", so it's a bit misleading...


Internally those cpu names are "max-{TYPE_ARM_CPU}" and similar for 
other architectures.


I like the change but it (IMHO) needs to cut "-{TYPE_*_CPU}" string 
from names:


13:37 marcin@applejack:qemu$ 
./build/aarch64-softmmu/qemu-system-aarch64 -M virt -cpu cortex-r5

qemu-system-aarch64: Invalid CPU type: cortex-r5-arm-cpu
The valid types are: cortex-a7-arm-cpu, cortex-a15-arm-cpu, 
cortex-a35-arm-cpu, cortex-a55-arm-cpu, cortex-a72-arm-cpu, 
cortex-a76-arm-cpu, a64fx-arm-cpu, neoverse-n1-arm-cpu, 
neoverse-v1-arm-cpu, cortex-a53-arm-cpu, cortex-a57-arm-cpu, 
host-arm-cpu, max-arm-cpu


13:37 marcin@applejack:qemu$ 
./build/aarch64-softmmu/qemu-system-aarch64 -M virt -cpu 
cortex-a57-arm-cpu

qemu-system-aarch64: unable to find CPU model 'cortex-a57-arm-cpu'



The suffix of CPU types are provided in hw/arm/virt.c::valid_cpu_types 
in PATCH[2].
In the generic validation, the complete CPU type is used. The error 
message also

have complete CPU type there.


In some places (arm_cpu_list_entry, arm_cpu_add_definition) we use:

  g_strndup(typename, strlen(typename) - strlen("-" TYPE_ARM_CPU))

Maybe extract as a helper? cpu_typename_name()? :)

Re: [PATCH 09/11] tpm_tis_sysbus: fix crash when PPI is enabled

2023-07-13 Thread Stefan Berger





On 7/12/23 23:51, Joelle van Dyne wrote:

If 'ppi' property is set, then `tpm_ppi_reset` is called on reset
which SEGFAULTs because `tpmppi->buf` is not allocated.

Signed-off-by: Joelle van Dyne 
---
  hw/tpm/tpm_tis_sysbus.c | 4 
  1 file changed, 4 insertions(+)

diff --git a/hw/tpm/tpm_tis_sysbus.c b/hw/tpm/tpm_tis_sysbus.c
index 45e63efd63..1014d5d993 100644
--- a/hw/tpm/tpm_tis_sysbus.c
+++ b/hw/tpm/tpm_tis_sysbus.c
@@ -124,6 +124,10 @@ static void tpm_tis_sysbus_realizefn(DeviceState *dev, 
Error **errp)
  error_setg(errp, "'tpmdev' property is required");
  return;
  }
+
+if (s->ppi_enabled) {
+sysbus_init_mmio(SYS_BUS_DEVICE(dev), &s->ppi.ram);
+}
  }



The tpm-tis-device doesn't work for x86_64 but for aarch64.


We have this here in this file:

DEFINE_PROP_BOOL("ppi", TPMStateSysBus, state.ppi_enabled, false),

I don't know whether ppi would work on aarch64. It needs firmware support like 
in edk2.
I think the best solution is to remove this DEFINE_PROP_BOOL() and if someone 
wants
to enable it they would have to add firmware support and test it before 
re-enabling it.

   Stefan


  static void tpm_tis_sysbus_class_init(ObjectClass *klass, void *data)

Re: [PATCH 04/11] tpm_crb: use a single read-as-mem/write-as-mmio mapping

2023-07-13 Thread Stefan Berger





On 7/13/23 11:55, Peter Maydell wrote:

On Thu, 13 Jul 2023 at 16:46, Stefan Berger  wrote:

On 7/13/23 11:34, Peter Maydell wrote:

On Thu, 13 Jul 2023 at 16:28, Stefan Berger  wrote:

On 7/13/23 10:50, Peter Maydell wrote:

I'm not a super-fan of hacking around the fact that LDP
to hardware registers isn't supported in specific device
models, though...


What does this mean for this effort here?


Usually we say "fix the guest to not try to access hardware
registers with silly load/store instruction types". The other
option would be "put in a large amount of effort to support
emulating those instructions in QEMU userspace when KVM/HVF/etc
trap and punt them to us". For the last decade or so we have
taken the first of these approaches :-)


Is Microsoft likely to react to use telling them "fix the guest"?


They have on occasion in the past, yes.

The other outstanding question here is if this TPM device
should be a sysbus one at all (i.e. not i2c), which might
render this part moot.


Does the aarch64 virt VM support an i2c bus? Would it support the aspeed i2c 
bus? Does Windows then accept this i2c bus? Maybe the faster answer comes via 
this device that Joelle presumably has working on AARCH64 Windows.


   Stefan


thanks
-- PMM

Re: [PATCH 04/11] tpm_crb: use a single read-as-mem/write-as-mmio mapping

2023-07-13 Thread Peter Maydell

On Thu, 13 Jul 2023 at 17:54, Stefan Berger  wrote:
>
>
>
> On 7/13/23 11:55, Peter Maydell wrote:
> > On Thu, 13 Jul 2023 at 16:46, Stefan Berger  wrote:
> >> On 7/13/23 11:34, Peter Maydell wrote:
> >>> On Thu, 13 Jul 2023 at 16:28, Stefan Berger  wrote:
>  On 7/13/23 10:50, Peter Maydell wrote:
> > I'm not a super-fan of hacking around the fact that LDP
> > to hardware registers isn't supported in specific device
> > models, though...
> 
>  What does this mean for this effort here?
> >>>
> >>> Usually we say "fix the guest to not try to access hardware
> >>> registers with silly load/store instruction types". The other
> >>> option would be "put in a large amount of effort to support
> >>> emulating those instructions in QEMU userspace when KVM/HVF/etc
> >>> trap and punt them to us". For the last decade or so we have
> >>> taken the first of these approaches :-)
> >>
> >> Is Microsoft likely to react to use telling them "fix the guest"?
> >
> > They have on occasion in the past, yes.
> >
> > The other outstanding question here is if this TPM device
> > should be a sysbus one at all (i.e. not i2c), which might
> > render this part moot.
>
> Does the aarch64 virt VM support an i2c bus? Would it support the aspeed i2c 
> bus? Does Windows then accept this i2c bus? Maybe the faster answer comes via 
> this device that Joelle presumably has working on AARCH64 Windows.

The aim is not "get Windows booting as fast as possible", though.
It's to end up with a QEMU virt board that (a) is maintainable
(b) is reasonably congruent with what real hardware does
(c) works in a way that will also work with what other
guest OSes are expecting.

I don't want to accept changes to the virt board that are
hard to live with in future, because changing virt in
non-backward compatible ways is painful.

thanks
-- PMM

Re: [PATCH 04/11] tpm_crb: use a single read-as-mem/write-as-mmio mapping

2023-07-13 Thread Stefan Berger





On 7/13/23 13:07, Peter Maydell wrote:

On Thu, 13 Jul 2023 at 17:54, Stefan Berger  wrote:




On 7/13/23 11:55, Peter Maydell wrote:

On Thu, 13 Jul 2023 at 16:46, Stefan Berger  wrote:

On 7/13/23 11:34, Peter Maydell wrote:

On Thu, 13 Jul 2023 at 16:28, Stefan Berger  wrote:

On 7/13/23 10:50, Peter Maydell wrote:

I'm not a super-fan of hacking around the fact that LDP
to hardware registers isn't supported in specific device
models, though...


What does this mean for this effort here?


Usually we say "fix the guest to not try to access hardware
registers with silly load/store instruction types". The other
option would be "put in a large amount of effort to support
emulating those instructions in QEMU userspace when KVM/HVF/etc
trap and punt them to us". For the last decade or so we have
taken the first of these approaches :-)


Is Microsoft likely to react to use telling them "fix the guest"?


They have on occasion in the past, yes.

The other outstanding question here is if this TPM device
should be a sysbus one at all (i.e. not i2c), which might
render this part moot.


Does the aarch64 virt VM support an i2c bus? Would it support the aspeed i2c 
bus? Does Windows then accept this i2c bus? Maybe the faster answer comes via 
this device that Joelle presumably has working on AARCH64 Windows.


The aim is not "get Windows booting as fast as possible", though.
It's to end up with a QEMU virt board that (a) is maintainable
(b) is reasonably congruent with what real hardware does
(c) works in a way that will also work with what other
guest OSes are expecting.

I don't want to accept changes to the virt board that are
hard to live with in future, because changing virt in
non-backward compatible ways is painful.



I guess the first point would be to decide whether to support an i2c bus on the 
virt board and then whether we can use the aspeed bus that we know that the 
tpm_tis_i2c device model works with but we don't know how Windows may react to 
it.

It seems sysbus is already supported there so ... we may have a 'match'?

dev = qdev_new("arm-gicv2m");
sysbus_mmio_map(SYS_BUS_DEVICE(dev), 0, vms->memmap[VIRT_GIC_V2M].base);
qdev_prop_set_uint32(dev, "base-spi", irq);
qdev_prop_set_uint32(dev, "num-spi", NUM_GICV2M_SPIS);
sysbus_realize_and_unref(SYS_BUS_DEVICE(dev), &error_fatal);


   Stefan


thanks
-- PMM

Re: [PATCH 04/11] tpm_crb: use a single read-as-mem/write-as-mmio mapping

2023-07-13 Thread Peter Maydell

On Thu, 13 Jul 2023 at 18:16, Stefan Berger  wrote:
> I guess the first point would be to decide whether to support an i2c bus on 
> the virt board and then whether we can use the aspeed bus that we know that 
> the tpm_tis_i2c device model works with but we don't know how Windows may 
> react to it.
>
> It seems sysbus is already supported there so ... we may have a 'match'?

You can use sysbus devices anywhere -- they're just
"this is a memory mapped device". The question is whether
we should, or whether an i2c controller is more like
what the real world uses (and if so, what i2c controller).

-- PMM

[PATCH] hw/tpm: TIS on sysbus: Remove unsupport ppi command line option

2023-07-13 Thread Stefan Berger

The ppi command line option for the TIS device on sysbus never worked
and caused an immediate segfault. Remove support for it since it also
needs support in the firmware and needs testing inside the VM.

Reproducer with the ppi=on option passed:

qemu-system-aarch64 \
   -machine virt,gic-version=3 \
   -m 4G  \
   -nographic -no-acpi \
   -chardev socket,id=chrtpm,path=/tmp/mytpm1/swtpm-sock \
   -tpmdev emulator,id=tpm0,chardev=chrtpm \
   -device tpm-tis-device,tpmdev=tpm0,ppi=on
[...]
Segmentation fault (core dumped)

Signed-off-by: Stefan Berger 
---
 hw/tpm/tpm_tis_sysbus.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/hw/tpm/tpm_tis_sysbus.c b/hw/tpm/tpm_tis_sysbus.c
index 45e63efd63..6724b3d4f6 100644
--- a/hw/tpm/tpm_tis_sysbus.c
+++ b/hw/tpm/tpm_tis_sysbus.c
@@ -93,7 +93,6 @@ static void tpm_tis_sysbus_reset(DeviceState *dev)
 static Property tpm_tis_sysbus_properties[] = {
 DEFINE_PROP_UINT32("irq", TPMStateSysBus, state.irq_num, TPM_TIS_IRQ),
 DEFINE_PROP_TPMBE("tpmdev", TPMStateSysBus, state.be_driver),
-DEFINE_PROP_BOOL("ppi", TPMStateSysBus, state.ppi_enabled, false),
 DEFINE_PROP_END_OF_LIST(),
 };
 
-- 
2.41.0

Re: [PATCH 00/11] tpm: introduce TPM CRB SysBus device

2023-07-13 Thread Joelle van Dyne

On Thu, Jul 13, 2023 at 6:07 AM Stefan Berger  wrote:
>
>
>
> On 7/12/23 23:51, Joelle van Dyne wrote:
> > The impetus for this patch set is to get TPM 2.0 working on Windows 11 
> > ARM64.
> > Windows' tpm.sys does not seem to work on a TPM TIS device (as verified with
> > VMWare's implementation). However, the current TPM CRB device uses a fixed
> > system bus address that is reserved for RAM in ARM64 Virt machines.
>
> Thanks a lot for this work. The last sentence seems to hint at the current 
> issue
> with TPM CRB on ARM64 and seems to be the only way forward there. You may want
> to reformulate it a bit because it's not clear how the 'however' related to
> CRB relates to TIS.
>
> >
> > In the process of adding the TPM CRB SysBus device, we also went ahead and
> > cleaned up some of the existing TPM hardware code and fixed some bugs. We 
> > used
>
> Please reorder bugs to the beginning of the series or submit in an extra 
> patch set
> so we can backport them. Ideal would be description(s) for how to trigger the 
> bug(s).
>
> > the TPM TIS devices as a template for the TPM CRB devices and refactored out
> > common code. We moved the ACPI DSDT generation to the device in order to 
> > handle
> > dynamic base address requirements as well as reduce redundent code in 
> > different
> s/redundent/redundant
>
>
> > machine ACPI generation. We also changed the tpm_crb device to use the ISA 
> > bus
> > instead of depending on the default system bus as the device only was built 
> > for
> > the PC configuration.
> >
> > Another change is that the TPM CRB registers are now mapped in the same way 
> > that
> > the pflash ROM devices are mapped. It is a memory region whose writes are
> > trapped as MMIO accesses. This was needed because Apple Silicon does not 
> > decode
> > LDP caused page faults. @agraf suggested that we do this to avoid having to
>
> Afaik, LDP is an ARM assembly instruction that loads two 32bit or 64bit 
> registers from
> consecutive addresses. May be worth mentioning for those wondering about it...
>
> > do AARCH64 decoding in the HVF fault handler.
>
> What is HVF?

Sorry, HVF is the QEMU backend for Apple's Hypervisor.framework which
runs on macOS including on Apple Silicon.

>
> Regards,
> Stefan
> >
> > Unfortunately, it seems like the LDP fault still happens on HVF but the 
> > issue
> > seems to be in the HVF backend which needs to be fixed in a separate patch.
> >
> > One last thing that's needed to get Windows 11 to recognize the TPM 2.0 
> > device
> > is for the OVMF firmware to setup the TPM device. Currently, OVMF for ARM64 
> > Virt
> > only recognizes the TPM TIS device through a FDT entry. A workaround is to
> > falsely identify the TPM CRB device as a TPM TIS device in the FDT node but 
> > this
> > causes issues for Linux. A proper fix would involve adding an ACPI device 
> > driver
> > in OVMF.
> >
> > Joelle van Dyne (11):
> >tpm_crb: refactor common code
> >tpm_crb: CTRL_RSP_ADDR is 64-bits wide
> >tpm_ppi: refactor memory space initialization
> >tpm_crb: use a single read-as-mem/write-as-mmio mapping
> >tpm_crb: use the ISA bus
> >tpm_crb: move ACPI table building to device interface
> >hw/arm/virt: add plug handler for TPM on SysBus
> >hw/loongarch/virt: add plug handler for TPM on SysBus
> >tpm_tis_sysbus: fix crash when PPI is enabled
> >tpm_tis_sysbus: move DSDT AML generation to device
> >tpm_crb_sysbus: introduce TPM CRB SysBus device
> >
> >   docs/specs/tpm.rst  |   2 +
> >   hw/tpm/tpm_crb.h|  74 +
> >   hw/tpm/tpm_ppi.h|  10 +-
> >   include/hw/acpi/aml-build.h |   1 +
> >   include/hw/acpi/tpm.h   |   3 +-
> >   include/sysemu/tpm.h|   3 +
> >   hw/acpi/aml-build.c |   7 +-
> >   hw/arm/virt-acpi-build.c|  38 +
> >   hw/arm/virt.c   |  38 +
> >   hw/core/sysbus-fdt.c|   1 +
> >   hw/i386/acpi-build.c|  23 ---
> >   hw/loongarch/acpi-build.c   |  38 +
> >   hw/loongarch/virt.c |  38 +
> >   hw/riscv/virt.c |   1 +
> >   hw/tpm/tpm_crb.c| 307 
> >   hw/tpm/tpm_crb_common.c | 224 ++
> >   hw/tpm/tpm_crb_sysbus.c | 178 +
> >   hw/tpm/tpm_ppi.c|   5 +-
> >   hw/tpm/tpm_tis_isa.c|   5 +-
> >   hw/tpm/tpm_tis_sysbus.c |  43 +
> >   tests/qtest/tpm-crb-test.c  |   2 +-
> >   tests/qtest/tpm-util.c  |   2 +-
> >   hw/arm/Kconfig  |   1 +
> >   hw/riscv/Kconfig|   1 +
> >   hw/tpm/Kconfig  |   7 +-
> >   hw/tpm/meson.build  |   3 +
> >   hw/tpm/trace-events |   2 +-
> >   27 files changed, 703 insertions(+), 354 deletions(-)
> >   create mode 100644 hw/tpm/tpm_crb.h
> >   create mode 100644 hw/tpm/tpm_crb_common.c
> >   create mode 100644 hw/tpm/tpm_crb_sysbus.c
> >

Re: [PATCH for-8.1 1/3] target/arm/ptw.c: Add comments to S1Translate struct fields

2023-07-13 Thread Richard Henderson


On 7/10/23 16:21, Peter Maydell wrote:

Add comments to the in_* fields in the S1Translate struct
that explain what they're doing.

Signed-off-by: Peter Maydell
---
I figured some of this out when writing commit fcc0b0418fff,
and then I found I'd forgotten it all when I was trying
to fix this new bug. So this time I'm writing this down :-)
---
  target/arm/ptw.c | 40 
  1 file changed, 40 insertions(+)


Reviewed-by: Richard Henderson 

r~

Re: [PATCH] hw/tpm: TIS on sysbus: Remove unsupport ppi command line option

2023-07-13 Thread Eric Auger

Hi Stefan,

On 7/13/23 19:19, Stefan Berger wrote:
> The ppi command line option for the TIS device on sysbus never worked
> and caused an immediate segfault. Remove support for it since it also
> needs support in the firmware and needs testing inside the VM.
>
> Reproducer with the ppi=on option passed:
>
> qemu-system-aarch64 \
>-machine virt,gic-version=3 \
>-m 4G  \
>-nographic -no-acpi \
>-chardev socket,id=chrtpm,path=/tmp/mytpm1/swtpm-sock \
>-tpmdev emulator,id=tpm0,chardev=chrtpm \
>-device tpm-tis-device,tpmdev=tpm0,ppi=on
> [...]
> Segmentation fault (core dumped)
>
> Signed-off-by: Stefan Berger 
Reviewed-by: Eric Auger 

Thanks!

Eric
> ---
>  hw/tpm/tpm_tis_sysbus.c | 1 -
>  1 file changed, 1 deletion(-)
>
> diff --git a/hw/tpm/tpm_tis_sysbus.c b/hw/tpm/tpm_tis_sysbus.c
> index 45e63efd63..6724b3d4f6 100644
> --- a/hw/tpm/tpm_tis_sysbus.c
> +++ b/hw/tpm/tpm_tis_sysbus.c
> @@ -93,7 +93,6 @@ static void tpm_tis_sysbus_reset(DeviceState *dev)
>  static Property tpm_tis_sysbus_properties[] = {
>  DEFINE_PROP_UINT32("irq", TPMStateSysBus, state.irq_num, TPM_TIS_IRQ),
>  DEFINE_PROP_TPMBE("tpmdev", TPMStateSysBus, state.be_driver),
> -DEFINE_PROP_BOOL("ppi", TPMStateSysBus, state.ppi_enabled, false),
>  DEFINE_PROP_END_OF_LIST(),
>  };
>

Re: [PATCH 07/11] hw/arm/virt: add plug handler for TPM on SysBus

2023-07-13 Thread Joelle van Dyne

On Thu, Jul 13, 2023 at 8:31 AM Peter Maydell  wrote:
>
> On Thu, 13 Jul 2023 at 04:52, Joelle van Dyne  wrote:
> >
> > TPM needs to know its own base address in order to generate its DSDT
> > device entry.
> >
> > Signed-off-by: Joelle van Dyne 
> > ---
> >  hw/arm/virt.c | 37 +
> >  1 file changed, 37 insertions(+)
> >
> > diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> > index 7d9dbc2663..432148ef47 100644
> > --- a/hw/arm/virt.c
> > +++ b/hw/arm/virt.c
> > @@ -2732,6 +2732,37 @@ static void virt_memory_plug(HotplugHandler 
> > *hotplug_dev,
> >   dev, &error_abort);
> >  }
> >
> > +#ifdef CONFIG_TPM
> > +static void virt_tpm_plug(VirtMachineState *vms, TPMIf *tpmif)
> > +{
> > +PlatformBusDevice *pbus = PLATFORM_BUS_DEVICE(vms->platform_bus_dev);
> > +hwaddr pbus_base = vms->memmap[VIRT_PLATFORM_BUS].base;
> > +SysBusDevice *sbdev = SYS_BUS_DEVICE(tpmif);
> > +MemoryRegion *sbdev_mr;
> > +hwaddr tpm_base;
> > +uint64_t tpm_size;
> > +
> > +if (!sbdev || !object_dynamic_cast(OBJECT(sbdev), 
> > TYPE_SYS_BUS_DEVICE)) {
> > +return;
> > +}
> > +
> > +tpm_base = platform_bus_get_mmio_addr(pbus, sbdev, 0);
> > +assert(tpm_base != -1);
> > +
> > +tpm_base += pbus_base;
> > +
> > +sbdev_mr = sysbus_mmio_get_region(sbdev, 0);
> > +tpm_size = memory_region_size(sbdev_mr);
> > +
> > +if (object_property_find(OBJECT(sbdev), "baseaddr")) {
> > +object_property_set_uint(OBJECT(sbdev), "baseaddr", tpm_base, 
> > NULL);
> > +}
> > +if (object_property_find(OBJECT(sbdev), "size")) {
> > +object_property_set_uint(OBJECT(sbdev), "size", tpm_size, NULL);
> > +}
> > +}
> > +#endif
>
> I do not like the "platform bus" at all -- it is a nasty hack.
> If the virt board needs a memory mapped TPM device it should probably
> just create one, the same way we create our other memory mapped
> devices. But...
>
> How are TPM devices typically set up/visible to the guest on
> real Arm server hardware ? Should this be a sysbus device at all?

+Alexander Graf who may answer this better.

My understanding is that we need to do this for the device to know its
own address which it needs to return in a register. On ISA devices, it
is always mapped to the same physical address so there's no issues but
for Virt machines, device addresses are dynamically allocated by the
PlatformBusDevice so only at this late stage can we tell the device
what its own address is.

>
> thanks
> -- PMM

Also to Stefan's question on consolidating code: that is ideal but
currently, it seems like much platform setup code is duplicated
amongst the various architecture's Virt machines. There would have to
be a larger effort in de-duplicating a lot of that code. Indeed, we
try to do this already with some of the ACPI stuff in the other
patches. For this specifically, we would need to know the platform
bus' base address which is done differently in ARM64's Virt and in
Loongarch's Virt. All we did was delete some existing duplicated code
and replace it with a different duplicated code :)

Re: [PATCH 06/11] tpm_crb: move ACPI table building to device interface

2023-07-13 Thread Joelle van Dyne

In that case, do you think we should have a check in "realize" to make
sure the backend is 2.0?

On Thu, Jul 13, 2023 at 9:08 AM Stefan Berger  wrote:
>
>
>
> On 7/12/23 23:51, Joelle van Dyne wrote:
> > This logic is similar to TPM TIS ISA device.
> >
> > Signed-off-by: Joelle van Dyne 
> > ---
> >   hw/i386/acpi-build.c | 23 ---
> >   hw/tpm/tpm_crb.c | 28 
> >   2 files changed, 28 insertions(+), 23 deletions(-)
> >
> > diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
> > index 9c74fa17ad..b767df39df 100644
> > --- a/hw/i386/acpi-build.c
> > +++ b/hw/i386/acpi-build.c
> > @@ -1441,9 +1441,6 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
> >   uint32_t nr_mem = machine->ram_slots;
> >   int root_bus_limit = 0xFF;
> >   PCIBus *bus = NULL;
> > -#ifdef CONFIG_TPM
> > -TPMIf *tpm = tpm_find();
> > -#endif
> >   bool cxl_present = false;
> >   int i;
> >   VMBusBridge *vmbus_bridge = vmbus_bridge_find();
> > @@ -1793,26 +1790,6 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
> >   }
> >   }
> >
> > -#ifdef CONFIG_TPM
> > -if (TPM_IS_CRB(tpm)) {
> > -dev = aml_device("TPM");
> > -aml_append(dev, aml_name_decl("_HID", aml_string("MSFT0101")));
> > -aml_append(dev, aml_name_decl("_STR",
> > -  aml_string("TPM 2.0 Device")));
> > -crs = aml_resource_template();
> > -aml_append(crs, aml_memory32_fixed(TPM_CRB_ADDR_BASE,
> > -   TPM_CRB_ADDR_SIZE, 
> > AML_READ_WRITE));
> > -aml_append(dev, aml_name_decl("_CRS", crs));
> > -
> > -aml_append(dev, aml_name_decl("_STA", aml_int(0xf)));
> > -aml_append(dev, aml_name_decl("_UID", aml_int(1)));
> > -
> > -tpm_build_ppi_acpi(tpm, dev);
> > -
> > -aml_append(sb_scope, dev);
> > -}
> > -#endif
> > -
> >   if (pcms->sgx_epc.size != 0) {
> >   uint64_t epc_base = pcms->sgx_epc.base;
> >   uint64_t epc_size = pcms->sgx_epc.size;
> > diff --git a/hw/tpm/tpm_crb.c b/hw/tpm/tpm_crb.c
> > index 6144081d30..14feb9857f 100644
> > --- a/hw/tpm/tpm_crb.c
> > +++ b/hw/tpm/tpm_crb.c
> > @@ -19,6 +19,8 @@
> >   #include "qemu/module.h"
> >   #include "qapi/error.h"
> >   #include "exec/address-spaces.h"
> > +#include "hw/acpi/acpi_aml_interface.h"
> > +#include "hw/acpi/tpm.h"
> >   #include "hw/qdev-properties.h"
> >   #include "hw/pci/pci_ids.h"
> >   #include "hw/acpi/tpm.h"
> > @@ -116,10 +118,34 @@ static void tpm_crb_isa_realize(DeviceState *dev, 
> > Error **errp)
> >   }
> >   }
> >
> > +static void build_tpm_crb_isa_aml(AcpiDevAmlIf *adev, Aml *scope)
> > +{
> > +Aml *dev, *crs;
> > +CRBState *s = CRB(adev);
> > +TPMIf *ti = TPM_IF(s);
> > +
> > +dev = aml_device("TPM");
> > +if (tpm_crb_isa_get_version(ti) == TPM_VERSION_2_0) {
> > +aml_append(dev, aml_name_decl("_HID", aml_string("MSFT0101")));
> > +aml_append(dev, aml_name_decl("_STR", aml_string("TPM 2.0 
> > Device")));
> > +} else {
> > +aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0C31")));
> > +}
>
> CRB only exists for TPM 2.0 and that's why we didn't have a different case 
> here before.
>
> CRB only has MSFT0101: 
> https://elixir.bootlin.com/linux/latest/source/drivers/char/tpm/tpm_crb.c#L820
> TIS has PNP0C31: 
> https://elixir.bootlin.com/linux/latest/source/drivers/char/tpm/tpm_tis.c
>
> You should remove the check for TPM_VERSION_2_0.
>
> Stefan
> > +aml_append(dev, aml_name_decl("_UID", aml_int(1)));
> > +aml_append(dev, aml_name_decl("_STA", aml_int(0xF)));
> > +crs = aml_resource_template();
> > +aml_append(crs, aml_memory32_fixed(TPM_CRB_ADDR_BASE, 
> > TPM_CRB_ADDR_SIZE,
> > +  AML_READ_WRITE));
> > +aml_append(dev, aml_name_decl("_CRS", crs));
> > +tpm_build_ppi_acpi(ti, dev);
> > +aml_append(scope, dev);
> > +}
> > +
> >   static void tpm_crb_isa_class_init(ObjectClass *klass, void *data)
> >   {
> >   DeviceClass *dc = DEVICE_CLASS(klass);
> >   TPMIfClass *tc = TPM_IF_CLASS(klass);
> > +AcpiDevAmlIfClass *adevc = ACPI_DEV_AML_IF_CLASS(klass);
> >
> >   dc->realize = tpm_crb_isa_realize;
> >   device_class_set_props(dc, tpm_crb_isa_properties);
> > @@ -128,6 +154,7 @@ static void tpm_crb_isa_class_init(ObjectClass *klass, 
> > void *data)
> >   tc->model = TPM_MODEL_TPM_CRB;
> >   tc->get_version = tpm_crb_isa_get_version;
> >   tc->request_completed = tpm_crb_isa_request_completed;
> > +adevc->build_dev_aml = build_tpm_crb_isa_aml;
> >
> >   set_bit(DEVICE_CATEGORY_MISC, dc->categories);
> >   }
> > @@ -139,6 +166,7 @@ static const TypeInfo tpm_crb_isa_info = {
> >   .class_init  = tpm_crb_isa_class_init,
> >   .interfaces = (InterfaceInfo[]) {
> >   { TYPE_TPM_IF },
> > +{ TYPE_ACPI_DEV_AML_IF },
> >   { }
> >

Re: [PATCH 09/11] tpm_tis_sysbus: fix crash when PPI is enabled

2023-07-13 Thread Joelle van Dyne

On Thu, Jul 13, 2023 at 9:49 AM Stefan Berger  wrote:
>
>
> The tpm-tis-device doesn't work for x86_64 but for aarch64.
>
>
> We have this here in this file:
>
>  DEFINE_PROP_BOOL("ppi", TPMStateSysBus, state.ppi_enabled, false),
>
> I don't know whether ppi would work on aarch64. It needs firmware support 
> like in edk2.
> I think the best solution is to remove this DEFINE_PROP_BOOL() and if someone 
> wants
> to enable it they would have to add firmware support and test it before 
> re-enabling it.
>
> Stefan
>
> >   static void tpm_tis_sysbus_class_init(ObjectClass *klass, void *data)

Yeah, I'm not sure if PPI works with AARCH64 since I didn't bother to
change it to not use hard coded addresses. However, isn't that "ppi"
overridable from the command line? If so, should we add a check in
"realize" to error if PPI=true? Otherwise, it will just crash.

Re: [PATCH for-8.1 2/3] target/arm: Fix S1_ptw_translate() debug path

2023-07-13 Thread Richard Henderson


On 7/10/23 16:21, Peter Maydell wrote:

In commit XXX we rearranged the logic in S1_ptw_translate() so that
the debug-access "call get_phys_addr_*" codepath is used both when S1
is doing ptw reads from stage 2 and when it is doing ptw reads from
physical memory.  However, we didn't update the calculation of
s2ptw->in_space and s2ptw->in_secure to account for the "ptw reads
from physical memory" case.  This meant that debug accesses when in
Secure state broke.

Create a new function S2_security_space() which returns the
correct security space to use for the ptw load, and use it to
determine the correct .in_secure and .in_space fields for the
stage 2 lookup for the ptw load.

Reported-by: Jean-Philippe Brucker
Fixes: fe4a5472ccd6 ("target/arm: Use get_phys_addr_with_struct in 
S1_ptw_translate")
Signed-off-by: Peter Maydell
---
  target/arm/ptw.c | 37 -
  1 file changed, 32 insertions(+), 5 deletions(-)


Reviewed-by: Richard Henderson 

r~

Re: [PATCH 06/11] tpm_crb: move ACPI table building to device interface

2023-07-13 Thread Stefan Berger





On 7/13/23 14:10, Joelle van Dyne wrote:

In that case, do you think we should have a check in "realize" to make
sure the backend is 2.0?


Maybe. I think at the moment it would simply not work (with existing drivers) 
without terminating QEMU on it due to the misconfiguration. On libvirt level we 
intercept this case and notify the user that the combination doesn't work. 
Leaving it like this would be an option...

   Stefan


On Thu, Jul 13, 2023 at 9:08 AM Stefan Berger  wrote:




On 7/12/23 23:51, Joelle van Dyne wrote:

This logic is similar to TPM TIS ISA device.

Signed-off-by: Joelle van Dyne 
---
   hw/i386/acpi-build.c | 23 ---
   hw/tpm/tpm_crb.c | 28 
   2 files changed, 28 insertions(+), 23 deletions(-)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 9c74fa17ad..b767df39df 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -1441,9 +1441,6 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
   uint32_t nr_mem = machine->ram_slots;
   int root_bus_limit = 0xFF;
   PCIBus *bus = NULL;
-#ifdef CONFIG_TPM
-TPMIf *tpm = tpm_find();
-#endif
   bool cxl_present = false;
   int i;
   VMBusBridge *vmbus_bridge = vmbus_bridge_find();
@@ -1793,26 +1790,6 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
   }
   }

-#ifdef CONFIG_TPM
-if (TPM_IS_CRB(tpm)) {
-dev = aml_device("TPM");
-aml_append(dev, aml_name_decl("_HID", aml_string("MSFT0101")));
-aml_append(dev, aml_name_decl("_STR",
-  aml_string("TPM 2.0 Device")));
-crs = aml_resource_template();
-aml_append(crs, aml_memory32_fixed(TPM_CRB_ADDR_BASE,
-   TPM_CRB_ADDR_SIZE, AML_READ_WRITE));
-aml_append(dev, aml_name_decl("_CRS", crs));
-
-aml_append(dev, aml_name_decl("_STA", aml_int(0xf)));
-aml_append(dev, aml_name_decl("_UID", aml_int(1)));
-
-tpm_build_ppi_acpi(tpm, dev);
-
-aml_append(sb_scope, dev);
-}
-#endif
-
   if (pcms->sgx_epc.size != 0) {
   uint64_t epc_base = pcms->sgx_epc.base;
   uint64_t epc_size = pcms->sgx_epc.size;
diff --git a/hw/tpm/tpm_crb.c b/hw/tpm/tpm_crb.c
index 6144081d30..14feb9857f 100644
--- a/hw/tpm/tpm_crb.c
+++ b/hw/tpm/tpm_crb.c
@@ -19,6 +19,8 @@
   #include "qemu/module.h"
   #include "qapi/error.h"
   #include "exec/address-spaces.h"
+#include "hw/acpi/acpi_aml_interface.h"
+#include "hw/acpi/tpm.h"
   #include "hw/qdev-properties.h"
   #include "hw/pci/pci_ids.h"
   #include "hw/acpi/tpm.h"
@@ -116,10 +118,34 @@ static void tpm_crb_isa_realize(DeviceState *dev, Error 
**errp)
   }
   }

+static void build_tpm_crb_isa_aml(AcpiDevAmlIf *adev, Aml *scope)
+{
+Aml *dev, *crs;
+CRBState *s = CRB(adev);
+TPMIf *ti = TPM_IF(s);
+
+dev = aml_device("TPM");
+if (tpm_crb_isa_get_version(ti) == TPM_VERSION_2_0) {
+aml_append(dev, aml_name_decl("_HID", aml_string("MSFT0101")));
+aml_append(dev, aml_name_decl("_STR", aml_string("TPM 2.0 Device")));
+} else {
+aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0C31")));
+}


CRB only exists for TPM 2.0 and that's why we didn't have a different case here 
before.

CRB only has MSFT0101: 
https://elixir.bootlin.com/linux/latest/source/drivers/char/tpm/tpm_crb.c#L820
TIS has PNP0C31: 
https://elixir.bootlin.com/linux/latest/source/drivers/char/tpm/tpm_tis.c

You should remove the check for TPM_VERSION_2_0.

 Stefan

+aml_append(dev, aml_name_decl("_UID", aml_int(1)));
+aml_append(dev, aml_name_decl("_STA", aml_int(0xF)));
+crs = aml_resource_template();
+aml_append(crs, aml_memory32_fixed(TPM_CRB_ADDR_BASE, TPM_CRB_ADDR_SIZE,
+  AML_READ_WRITE));
+aml_append(dev, aml_name_decl("_CRS", crs));
+tpm_build_ppi_acpi(ti, dev);
+aml_append(scope, dev);
+}
+
   static void tpm_crb_isa_class_init(ObjectClass *klass, void *data)
   {
   DeviceClass *dc = DEVICE_CLASS(klass);
   TPMIfClass *tc = TPM_IF_CLASS(klass);
+AcpiDevAmlIfClass *adevc = ACPI_DEV_AML_IF_CLASS(klass);

   dc->realize = tpm_crb_isa_realize;
   device_class_set_props(dc, tpm_crb_isa_properties);
@@ -128,6 +154,7 @@ static void tpm_crb_isa_class_init(ObjectClass *klass, void 
*data)
   tc->model = TPM_MODEL_TPM_CRB;
   tc->get_version = tpm_crb_isa_get_version;
   tc->request_completed = tpm_crb_isa_request_completed;
+adevc->build_dev_aml = build_tpm_crb_isa_aml;

   set_bit(DEVICE_CATEGORY_MISC, dc->categories);
   }
@@ -139,6 +166,7 @@ static const TypeInfo tpm_crb_isa_info = {
   .class_init  = tpm_crb_isa_class_init,
   .interfaces = (InterfaceInfo[]) {
   { TYPE_TPM_IF },
+{ TYPE_ACPI_DEV_AML_IF },
   { }
   }
   };

Re: [PATCH for-8.1 3/3] target/arm/ptw.c: Account for FEAT_RME when applying {N}SW,SA bits

2023-07-13 Thread Richard Henderson


On 7/10/23 16:21, Peter Maydell wrote:

In get_phys_addr_twostage() the code that applies the effects of
VSTCR.{SA,SW} and VTCR.{NSA,NSW} only updates result->f.attrs.secure.
Now we also have f.attrs.space for FEAT_RME, we need to keep the two
in sync.

These bits only have an effect for Secure space translations, not
for Root, so use the input in_space field to determine whether to
apply them rather than the input is_secure. This doesn't actually
make a difference because Root translations are never two-stage,
but it's a little clearer.

Signed-off-by: Peter Maydell
---
I noticed this while reading through the ptw code...
---
  target/arm/ptw.c | 13 -
  1 file changed, 8 insertions(+), 5 deletions(-)


Reviewed-by: Richard Henderson 

r~

Re: [PATCH 09/11] tpm_tis_sysbus: fix crash when PPI is enabled

2023-07-13 Thread Stefan Berger





On 7/13/23 14:15, Joelle van Dyne wrote:

On Thu, Jul 13, 2023 at 9:49 AM Stefan Berger  wrote:



The tpm-tis-device doesn't work for x86_64 but for aarch64.


We have this here in this file:

  DEFINE_PROP_BOOL("ppi", TPMStateSysBus, state.ppi_enabled, false),

I don't know whether ppi would work on aarch64. It needs firmware support like 
in edk2.
I think the best solution is to remove this DEFINE_PROP_BOOL() and if someone 
wants
to enable it they would have to add firmware support and test it before 
re-enabling it.

 Stefan


   static void tpm_tis_sysbus_class_init(ObjectClass *klass, void *data)


Yeah, I'm not sure if PPI works with AARCH64 since I didn't bother to
change it to not use hard coded addresses. However, isn't that "ppi"
overridable from the command line? If so, should we add a check in
"realize" to error if PPI=true? Otherwise, it will just crash.


Once the option is removed via my patch (cc'ed you), then you get this once you 
pass ppi=on on the command line:

qemu-system-aarch64: -device tpm-tis-device,tpmdev=tpm0,ppi=on: Property 
'tpm-tis-device.ppi' not found

This disables it for good.

   Stefan

Re: [PATCH 05/11] tpm_crb: use the ISA bus

2023-07-13 Thread Stefan Berger





On 7/12/23 23:51, Joelle van Dyne wrote:

Since this device is gated to only build for targets with the PC
configuration, we should use the ISA bus like with TPM TIS.

Signed-off-by: Joelle van Dyne 


I think this patch is good but I'd like to try it with resuming and old VM 
snapshot and for this to work we need 04/11 to have the registers in the VM 
state.


   Stefan

---
  hw/tpm/tpm_crb.c | 52 
  hw/tpm/Kconfig   |  2 +-
  2 files changed, 27 insertions(+), 27 deletions(-)

diff --git a/hw/tpm/tpm_crb.c b/hw/tpm/tpm_crb.c
index 07c6868d8d..6144081d30 100644
--- a/hw/tpm/tpm_crb.c
+++ b/hw/tpm/tpm_crb.c
@@ -22,6 +22,7 @@
  #include "hw/qdev-properties.h"
  #include "hw/pci/pci_ids.h"
  #include "hw/acpi/tpm.h"
+#include "hw/isa/isa.h"
  #include "migration/vmstate.h"
  #include "sysemu/tpm_backend.h"
  #include "sysemu/tpm_util.h"
@@ -34,7 +35,7 @@
  #include "tpm_crb.h"

  struct CRBState {
-DeviceState parent_obj;
+ISADevice parent_obj;

  TPMCRBState state;
  };
@@ -43,49 +44,49 @@ typedef struct CRBState CRBState;
  DECLARE_INSTANCE_CHECKER(CRBState, CRB,
   TYPE_TPM_CRB)

-static void tpm_crb_none_request_completed(TPMIf *ti, int ret)
+static void tpm_crb_isa_request_completed(TPMIf *ti, int ret)
  {
  CRBState *s = CRB(ti);

  tpm_crb_request_completed(&s->state, ret);
  }

-static enum TPMVersion tpm_crb_none_get_version(TPMIf *ti)
+static enum TPMVersion tpm_crb_isa_get_version(TPMIf *ti)
  {
  CRBState *s = CRB(ti);

  return tpm_crb_get_version(&s->state);
  }

-static int tpm_crb_none_pre_save(void *opaque)
+static int tpm_crb_isa_pre_save(void *opaque)
  {
  CRBState *s = opaque;

  return tpm_crb_pre_save(&s->state);
  }

-static const VMStateDescription vmstate_tpm_crb_none = {
+static const VMStateDescription vmstate_tpm_crb_isa = {
  .name = "tpm-crb",
-.pre_save = tpm_crb_none_pre_save,
+.pre_save = tpm_crb_isa_pre_save,
  .fields = (VMStateField[]) {
  VMSTATE_END_OF_LIST(),
  }
  };

-static Property tpm_crb_none_properties[] = {
+static Property tpm_crb_isa_properties[] = {
  DEFINE_PROP_TPMBE("tpmdev", CRBState, state.tpmbe),
  DEFINE_PROP_BOOL("ppi", CRBState, state.ppi_enabled, true),
  DEFINE_PROP_END_OF_LIST(),
  };

-static void tpm_crb_none_reset(void *dev)
+static void tpm_crb_isa_reset(void *dev)
  {
  CRBState *s = CRB(dev);

  return tpm_crb_reset(&s->state, TPM_CRB_ADDR_BASE);
  }

-static void tpm_crb_none_realize(DeviceState *dev, Error **errp)
+static void tpm_crb_isa_realize(DeviceState *dev, Error **errp)
  {
  CRBState *s = CRB(dev);

@@ -100,52 +101,51 @@ static void tpm_crb_none_realize(DeviceState *dev, Error 
**errp)

  tpm_crb_init_memory(OBJECT(s), &s->state, errp);

-memory_region_add_subregion(get_system_memory(),
+memory_region_add_subregion(isa_address_space(ISA_DEVICE(dev)),
  TPM_CRB_ADDR_BASE, &s->state.mmio);

  if (s->state.ppi_enabled) {
-memory_region_add_subregion(get_system_memory(),
+memory_region_add_subregion(isa_address_space(ISA_DEVICE(dev)),
  TPM_PPI_ADDR_BASE, &s->state.ppi.ram);
  }

  if (xen_enabled()) {
-tpm_crb_none_reset(dev);
+tpm_crb_isa_reset(dev);
  } else {
-qemu_register_reset(tpm_crb_none_reset, dev);
+qemu_register_reset(tpm_crb_isa_reset, dev);
  }
  }

-static void tpm_crb_none_class_init(ObjectClass *klass, void *data)
+static void tpm_crb_isa_class_init(ObjectClass *klass, void *data)
  {
  DeviceClass *dc = DEVICE_CLASS(klass);
  TPMIfClass *tc = TPM_IF_CLASS(klass);

-dc->realize = tpm_crb_none_realize;
-device_class_set_props(dc, tpm_crb_none_properties);
-dc->vmsd  = &vmstate_tpm_crb_none;
+dc->realize = tpm_crb_isa_realize;
+device_class_set_props(dc, tpm_crb_isa_properties);
+dc->vmsd  = &vmstate_tpm_crb_isa;
  dc->user_creatable = true;
  tc->model = TPM_MODEL_TPM_CRB;
-tc->get_version = tpm_crb_none_get_version;
-tc->request_completed = tpm_crb_none_request_completed;
+tc->get_version = tpm_crb_isa_get_version;
+tc->request_completed = tpm_crb_isa_request_completed;

  set_bit(DEVICE_CATEGORY_MISC, dc->categories);
  }

-static const TypeInfo tpm_crb_none_info = {
+static const TypeInfo tpm_crb_isa_info = {
  .name = TYPE_TPM_CRB,
-/* could be TYPE_SYS_BUS_DEVICE (or LPC etc) */
-.parent = TYPE_DEVICE,
+.parent = TYPE_ISA_DEVICE,
  .instance_size = sizeof(CRBState),
-.class_init  = tpm_crb_none_class_init,
+.class_init  = tpm_crb_isa_class_init,
  .interfaces = (InterfaceInfo[]) {
  { TYPE_TPM_IF },
  { }
  }
  };

-static void tpm_crb_none_register(void)
+static void tpm_crb_isa_register(void)
  {
-type_register_static(&tpm_crb_none_info);
+type_register_static(&tpm_crb_isa_info);
  }

-type_init(tpm_crb_none_register)
+type_init(tpm_

Re: [PATCH 04/11] tpm_crb: use a single read-as-mem/write-as-mmio mapping

2023-07-13 Thread Stefan Berger





On 7/13/23 13:18, Peter Maydell wrote:

On Thu, 13 Jul 2023 at 18:16, Stefan Berger  wrote:

I guess the first point would be to decide whether to support an i2c bus on the 
virt board and then whether we can use the aspeed bus that we know that the 
tpm_tis_i2c device model works with but we don't know how Windows may react to 
it.

It seems sysbus is already supported there so ... we may have a 'match'?


You can use sysbus devices anywhere -- they're just


'anywhere' also includes aarch64 virt board I suppose.


"this is a memory mapped device". The question is whether
we should, or whether an i2c controller is more like
what the real world uses (and if so, what i2c controller).




I don't want to accept changes to the virt board that are
hard to live with in future, because changing virt in
non-backward compatible ways is painful.


Once we have the CRB sysbus device we would keep it around forever and it seems 
to
- not require any changes to the virt board (iiuc) since sysbus is already 
being used
- works already with Windows and probably also Linux


   Stefan


-- PMM

[RFC, PATCH, trivial, sample] treewide: spelling fixes in comments and some strings

2023-07-13 Thread Michael Tokarev

I got annoyed enough by various misspellings, and tried to clean up this
a bit.  And got this in the result, for now:

https://gitlab.com/mjt0k/qemu/-/commit/eb5a376c7282e63d9e11eb952046b01f1a5ae7d4

Below is a diffstat plus a few actual changes as a sample.

It fixes misspellings in comments and in some strings (mostly in
error messages). There are a few misspelt variable names which are
worth fixing too - this patch does not change variable names.

Now, given the size of the patch, I wonder how to address it in
a best way. Hence the RFC.  It took quite a lot of work to went
there, there's no automatic tools involved - all changes are
done after manual inspection of the context (with some help
from codespell).  So I'd love to avoid grouping it by maintainer, -
maximum I can do is to group by source directory.

Or maybe there's some other way to attack this?

Thanks,

/mjt

---
 accel/tcg/tb-maint.c  |  2 +-
 audio/mixeng.h|  2 +-
 backends/tpm/tpm_ioctl.h  |  2 +-
 block.c   |  2 +-
 block/block-copy.c|  4 +--
 block/export/vduse-blk.c  |  2 +-
 block/export/vhost-user-blk-server.c  |  2 +-
 block/export/vhost-user-blk-server.h  |  2 +-
 block/file-posix.c|  8 +++---
 block/graph-lock.c|  2 +-
 block/io.c|  2 +-
 block/linux-aio.c |  2 +-
 block/mirror.c|  2 +-
 block/qcow2-refcount.c|  2 +-
 block/vhdx.c  |  2 +-
 block/vhdx.h  |  4 +--
 bsd-user/errno_defs.h |  2 +-
 bsd-user/freebsd/target_os_siginfo.h  |  2 +-
 bsd-user/freebsd/target_os_stack.h|  4 +--
 bsd-user/freebsd/target_os_user.h |  2 +-
 bsd-user/qemu.h   |  2 +-
 bsd-user/signal-common.h  |  4 +--
 bsd-user/signal.c |  6 ++---
 chardev/char-socket.c |  6 ++---
 chardev/char.c|  2 +-
 contrib/plugins/cache.c   |  2 +-
 contrib/plugins/lockstep.c|  2 +-
 crypto/afalg.c|  2 +-
 crypto/block-luks.c   |  6 ++---
 crypto/der.c  |  2 +-
 crypto/der.h  |  6 ++---
 docs/about/deprecated.rst |  2 +-
 docs/devel/qapi-code-gen.rst  |  2 +-
 docs/devel/qom.rst|  2 +-
 docs/system/arm/palm.rst  |  2 +-
 docs/system/arm/xscale.rst|  2 +-
 docs/system/devices/can.rst   |  6 ++---
 docs/system/devices/nvme.rst  |  2 +-
 host/include/aarch64/host/cpuinfo.h   |  2 +-
 host/include/generic/host/cpuinfo.h   |  2 +-
 host/include/i386/host/cpuinfo.h  |  2 +-
 host/include/ppc/host/cpuinfo.h   |  2 +-
 hw/9pfs/9p-local.c|  8 +++---
 hw/9pfs/9p-proxy.c|  2 +-
 hw/9pfs/9p-synth.c|  2 +-
 hw/9pfs/9p-util.h |  2 +-
 hw/9pfs/9p.c  |  4 +--
 hw/9pfs/9p.h  |  2 +-
 hw/acpi/aml-build.c   |  6 ++---
 hw/acpi/hmat.c|  2 +-
 hw/acpi/nvdimm.c  |  2 +-
 hw/arm/aspeed.c   |  2 +-
 hw/arm/mps2-tz.c  |  2 +-
 hw/arm/versatilepb.c  |  2 +-
 hw/audio/fmopl.c  |  8 +++---
 hw/audio/fmopl.h  |  2 +-
 hw/audio/gusemu_hal.c |  4 +--
 hw/audio/intel-hda-defs.h |  4 +--
 hw/block/hd-geometry.c|  4 +--
 hw/block/pflash_cfi01.c   |  2 +-
 hw/char/cadence_uart.c|  2 +-
 hw/char/imx_serial.c  |  2 +-
 hw/char/serial.c  |  2 +-
 hw/core/generic-loader.c  |  4 +--
 hw/core/loader.c  |  4 +--
 hw/core/machine.c |  2 +-
 hw/core/qdev-properties-system.c  |  2 +-
 hw/cpu/a15mpcore.c|  2 +-
 hw/cxl/cxl-events.c   |  2 +-
 hw/cxl/cxl-host.c |  2 +-
 hw/cxl/cxl-mailbox-utils.c|  4 +--
 hw/display/bochs-display.c|  2 +-
 hw/display/qxl.c  |  2 +-
 hw/display/ssd0303.c  |  2

Re: [PATCH 0/3] hw/arm/virt: Use generic CPU invalidation

2023-07-13 Thread Richard Henderson


On 7/13/23 13:34, Gavin Shan wrote:

Hi Peter and Marcin,

On 7/13/23 21:52, Marcin Juszkiewicz wrote:

W dniu 13.07.2023 o 13:44, Peter Maydell pisze:


I see this isn't a change in this patch, but given that
what the user specifies is not "cortex-a8-arm-cpu" but
"cortex-a8", why do we include the "-arm-cpu" suffix in
the error messages? It's not valid syntax to say
"-cpu cortex-a8-arm-cpu", so it's a bit misleading...


Internally those cpu names are "max-{TYPE_ARM_CPU}" and similar for other 
architectures.

I like the change but it (IMHO) needs to cut "-{TYPE_*_CPU}" string from names:

13:37 marcin@applejack:qemu$ ./build/aarch64-softmmu/qemu-system-aarch64 -M virt -cpu 
cortex-r5

qemu-system-aarch64: Invalid CPU type: cortex-r5-arm-cpu
The valid types are: cortex-a7-arm-cpu, cortex-a15-arm-cpu, cortex-a35-arm-cpu, 
cortex-a55-arm-cpu, cortex-a72-arm-cpu, cortex-a76-arm-cpu, a64fx-arm-cpu, 
neoverse-n1-arm-cpu, neoverse-v1-arm-cpu, cortex-a53-arm-cpu, cortex-a57-arm-cpu, 
host-arm-cpu, max-arm-cpu


13:37 marcin@applejack:qemu$ ./build/aarch64-softmmu/qemu-system-aarch64 -M virt -cpu 
cortex-a57-arm-cpu

qemu-system-aarch64: unable to find CPU model 'cortex-a57-arm-cpu'



The suffix of CPU types are provided in hw/arm/virt.c::valid_cpu_types in 
PATCH[2].
In the generic validation, the complete CPU type is used. The error message also
have complete CPU type there.

Peter and Marcin, how about to split the CPU types to two fields, as below? In 
this
way, the complete CPU type will be used for validation and the 'internal' names 
will
be used for the error messages.

struct MachineClass {
     const char *valid_cpu_type_suffix;
     const char **valid_cpu_types;


While you're changing this:

const char * const *valid_cpu_types;



};

hw/arm/virt.c
-

static const char *valid_cpu_types[] = {


So that you can then do

static const char * const valid_cpu_types[]



r~

drain_call_rcu() vs nested event loops

2023-07-13 Thread Stefan Hajnoczi

Hi,
I've encountered a bug where two vcpu threads enter a device's MMIO
emulation callback at the same time. This is never supposed to happen
thanks to the Big QEMU Lock (BQL), but drain_call_rcu() and nested event
loops make it possible:

1. A device's MMIO emulation callback invokes AIO_WAIT_WHILE().
2. A device_add monitor command runs in AIO_WAIT_WHILE()'s aio_poll()
   nested event loop.
3. qmp_device_add() -> drain_call_rcu() is called and the BQL is
   temporarily dropped.
4. Another vcpu thread dispatches the same device's MMIO callback
   because it is now able to acquire the BQL.

I've included the backtraces below if you want to see the details. They
are from a RHEL qemu-kvm 6.2.0-35 coredump but I haven't found anything
in qemu.git/master that would fix this.

One fix is to make qmp_device_add() a coroutine and schedule a BH in the
iohandler AioContext. That way the coroutine must wait until the nested
event loop finishes before its BH executes. drain_call_rcu() will never
be called from a nested event loop and the problem does not occur
anymore.

Another possibility is to remove the following in monitor_qmp_dispatcher_co():

  /*
   * Move the coroutine from iohandler_ctx to qemu_aio_context for
   * executing the command handler so that it can make progress if it
   * involves an AIO_WAIT_WHILE().
   */
  aio_co_schedule(qemu_get_aio_context(), qmp_dispatcher_co);
  qemu_coroutine_yield();

By executing QMP commands in the iohandler AioContext by default, we can
prevent issues like this in the future. However, there might be some QMP
commands that assume they are running in the qemu_aio_context (e.g.
coroutine commands that yield) and they might need to manually move to
the qemu_aio_context.

What do you think?

Stefan
---
Thread 41 (Thread 0x7fdc3dffb700 (LWP 910296)):
#0  0x7fde88ac99bd in syscall () from /lib64/libc.so.6
#1  0x55bd7a2e066f in qemu_futex_wait (val=, f=) at 
/usr/src/debug/qemu-kvm-6.2.0-35.module+el8.9.0+19024+8193e2ac.x86_64/include/qemu/futex.h:29
#2  qemu_event_wait (ev=ev@entry=0x7fdc3dffa2d0) at 
../util/qemu-thread-posix.c:510
#3  0x55bd7a2e8e54 in drain_call_rcu () at ../util/rcu.c:347
#4  0x55bd79f63d1e in qmp_device_add (qdict=, 
ret_data=, errp=) at ../softmmu/qdev-monitor.c:863
#5  0x55bd7a2d420d in do_qmp_dispatch_bh (opaque=0x7fde8c22aee0) at 
../qapi/qmp-dispatch.c:129
#6  0x55bd7a2ef3bd in aio_bh_call (bh=0x7fdc6015cd50) at ../util/async.c:174
#7  aio_bh_poll (ctx=ctx@entry=0x55bd7c910f40) at ../util/async.c:174
#8  0x55bd7a2dd3b2 in aio_poll (ctx=0x55bd7c910f40, 
blocking=blocking@entry=true) at ../util/aio-posix.c:659
#9  0x55bd7a2effea in aio_wait_bh_oneshot (ctx=0x55bd7ca980e0, 
cb=cb@entry=0x55bd7a11a9c0 , 
opaque=opaque@entry=0x55bd7e585c40) at ../util/aio-wait.c:85
#10 0x55bd7a11b30b in virtio_blk_data_plane_stop (vdev=) at 
../hw/block/dataplane/virtio-blk.c:333
#11 0x55bd7a0591e0 in virtio_bus_stop_ioeventfd 
(bus=bus@entry=0x55bd7cb57ba8) at ../hw/virtio/virtio-bus.c:258
#12 0x55bd7a05995f in virtio_bus_stop_ioeventfd 
(bus=bus@entry=0x55bd7cb57ba8) at ../hw/virtio/virtio-bus.c:250
#13 0x55bd7a05b238 in virtio_pci_stop_ioeventfd (proxy=0x55bd7cb4f9a0) at 
../hw/virtio/virtio-pci.c:1289
#14 virtio_pci_common_write (opaque=0x55bd7cb4f9a0, addr=, 
val=, size=) at ../hw/virtio/virtio-pci.c:1289
^^
#15 0x55bd7a0f6777 in memory_region_write_accessor (mr=0x55bd7cb50410, 
addr=, value=, size=1, shift=, 
mask=, attrs=...) at ../softmmu/memory.c:492
#16 0x55bd7a0f320e in access_with_adjusted_size (addr=addr@entry=20, 
value=value@entry=0x7fdc3dffa5c8, size=size@entry=1, access_size_min=, access_size_max=, 
access_fn=0x55bd7a0f6710 , mr=0x55bd7cb50410, 
attrs=...) at ../softmmu/memory.c:554
#17 0x55bd7a0f62a3 in memory_region_dispatch_write 
(mr=mr@entry=0x55bd7cb50410, addr=20, data=, op=, 
attrs=attrs@entry=...) at ../softmmu/memory.c:1504
#18 0x55bd7a0e7f2e in flatview_write_continue (fv=fv@entry=0x55bd7d17cad0, 
addr=addr@entry=4236247060, attrs=..., ptr=ptr@entry=0x7fde84003028, 
len=len@entry=1, addr1=, l=, 
mr=0x55bd7cb50410) at 
/usr/src/debug/qemu-kvm-6.2.0-35.module+el8.9.0+19024+8193e2ac.x86_64/include/qemu/host-utils.h:165
#19 0x55bd7a0e8093 in flatview_write (fv=0x55bd7d17cad0, addr=4236247060, 
attrs=..., buf=0x7fde84003028, len=1) at ../softmmu/physmem.c:2856
#20 0x55bd7a0ebc6f in address_space_write (as=, 
addr=, attrs=..., buf=, len=) at 
../softmmu/physmem.c:2952
#21 0x55bd7a1a28b9 in kvm_cpu_exec (cpu=cpu@entry=0x55bd7cc32bf0) at 
../accel/kvm/kvm-all.c:2995
#22 0x55bd7a1a36e5 in kvm_vcpu_thread_fn (arg=0x55bd7cc32bf0) at 
../accel/kvm/kvm-accel-ops.c:49
#23 0x55bd7a2dfdd4 in qemu_thread_start (args=0x55bd7cc41f20) at 
../util/qemu-thread-posix.c:585
#24 0x7fde88e5d1ca in start_thread () from /lib64/libpthread.so.0
#25 0x7fde88ac9e73 in clone () from /lib64/libc.so.6

Thread 1 (Thread 0x7fdc6f5fe

Re: [PATCH v1 13/15] virtio-mem: Expose device memory via multiple memslots if enabled

2023-07-13 Thread Maciej S. Szmigiero


On 16.06.2023 11:26, David Hildenbrand wrote:

Having large virtio-mem devices that only expose little memory to a VM
is currently a problem: we map the whole sparse memory region into the
guest using a single memslot, resulting in one gigantic memslot in KVM.
KVM allocates metadata for the whole memslot, which can result in quite
some memory waste.

Assuming we have a 1 TiB virtio-mem device and only expose little (e.g.,
1 GiB) memory, we would create a single 1 TiB memslot and KVM has to
allocate metadata for that 1 TiB memslot: on x86, this implies allocating
a significant amount of memory for metadata:

(1) RMAP: 8 bytes per 4 KiB, 8 bytes per 2 MiB, 8 bytes per 1 GiB
 -> For 1 TiB: 2147483648 + 4194304 + 8192 = ~ 2 GiB (0.2 %)

 With the TDP MMU (cat /sys/module/kvm/parameters/tdp_mmu) this gets
 allocated lazily when required for nested VMs
(2) gfn_track: 2 bytes per 4 KiB
 -> For 1 TiB: 536870912 = ~512 MiB (0.05 %)
(3) lpage_info: 4 bytes per 2 MiB, 4 bytes per 1 GiB
 -> For 1 TiB: 2097152 + 4096 = ~2 MiB (0.0002 %)
(4) 2x dirty bitmaps for tracking: 2x 1 bit per 4 KiB page
 -> For 1 TiB: 536870912 = 64 MiB (0.006 %)

So we primarily care about (1) and (2). The bad thing is, that the
memory consumption *doubles* once SMM is enabled, because we create the
memslot once for !SMM and once for SMM.

Having a 1 TiB memslot without the TDP MMU consumes around:
* With SMM: 5 GiB
* Without SMM: 2.5 GiB
Having a 1 TiB memslot with the TDP MMU consumes around:
* With SMM: 1 GiB
* Without SMM: 512 MiB

... and that's really something we want to optimize, to be able to just
start a VM with small boot memory (e.g., 4 GiB) and a virtio-mem device
that can grow very large (e.g., 1 TiB).

Consequently, using multiple memslots and only mapping the memslots we
really need can significantly reduce memory waste and speed up
memslot-related operations. Let's expose the sparse RAM memory region using
multiple memslots, mapping only the memslots we currently need into our
device memory region container.

* With VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE, we only map the memslots that
   actually have memory plugged, and dynamically (un)map when
   (un)plugging memory blocks.

* Without VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE, we always map the memslots
   covered by the usable region, and dynamically (un)map when resizing the
   usable region.

We'll auto-determine the number of memslots to use based on the suggested
memslot limit provided by the core. We'll use at most 1 memslot per
gigabyte. Note that our global limit of memslots accross all memory devices
is currently set to 256: even with multiple large virtio-mem devices, we'd
still have a sane limit on the number of memslots used.

The default is a single memslot for now ("multiple-memslots=off"). The
optimization must be enabled manually using "multiple-memslots=on", because
some vhost setups (e.g., hotplug of vhost-user devices) might be
problematic until we support more memslots especially in vhost-user
backends.

Note that "multiple-memslots=on" is just a hint that multiple memslots
*may* be used for internal optimizations, not that multiple memslots
*must* be used. The actual number of memslots that are used is an
internal detail: for example, once memslot metadata is no longer an
issue, we could simply stop optimizing for that. Migration source and
destination can differ on the setting of "multiple-memslots".

Signed-off-by: David Hildenbrand 
---
  hw/virtio/virtio-mem-pci.c |  21 +++
  hw/virtio/virtio-mem.c | 265 -
  include/hw/virtio/virtio-mem.h |  23 ++-
  3 files changed, 304 insertions(+), 5 deletions(-)

diff --git a/hw/virtio/virtio-mem-pci.c b/hw/virtio/virtio-mem-pci.c
index b85c12668d..8b403e7e78 100644
--- a/hw/virtio/virtio-mem-pci.c
+++ b/hw/virtio/virtio-mem-pci.c

(...)

@@ -790,6 +921,43 @@ static void virtio_mem_system_reset(void *opaque)
  virtio_mem_unplug_all(vmem);
  }
  
+static void virtio_mem_prepare_mr(VirtIOMEM *vmem)

+{
+const uint64_t region_size = memory_region_size(&vmem->memdev->mr);
+
+g_assert(!vmem->mr);
+vmem->mr = g_new0(MemoryRegion, 1);
+memory_region_init(vmem->mr, OBJECT(vmem), "virtio-mem",
+   region_size);
+vmem->mr->align = memory_region_get_alignment(&vmem->memdev->mr);
+}
+
+static void virtio_mem_prepare_memslots(VirtIOMEM *vmem)
+{
+const uint64_t region_size = memory_region_size(&vmem->memdev->mr);
+unsigned int idx;
+
+g_assert(!vmem->memslots && vmem->nb_memslots);
+vmem->memslots = g_new0(MemoryRegion, vmem->nb_memslots);
+
+/* Initialize our memslots, but don't map them yet. */
+for (idx = 0; idx < vmem->nb_memslots; idx++) {
+const uint64_t memslot_offset = idx * vmem->memslot_size;
+uint64_t memslot_size = vmem->memslot_size;
+char name[20];
+
+/* The size of the last memslot might be smaller. */
+if (idx == vmem->nb_memslots) {

Re: [RFC, PATCH, trivial, sample] treewide: spelling fixes in comments and some strings

2023-07-13 Thread Michael Tokarev


..

  include/standard-headers/drm/drm_fourcc.h |  8 +++---
  include/standard-headers/linux/ethtool.h  |  2 +-
  .../standard-headers/linux/virtio_console.h   |  2 +-
  include/standard-headers/linux/virtio_i2c.h   |  2 +-
  include/standard-headers/linux/virtio_net.h   |  4 +--


It looks like these should be dropped:

diff --git a/include/standard-headers/drm/drm_fourcc.h 
b/include/standard-headers/drm/drm_fourcc.h
index 72279f4d25..b782802a34 100644
--- a/include/standard-headers/drm/drm_fourcc.h
+++ b/include/standard-headers/drm/drm_fourcc.h
@@ -55,3 +55,3 @@ extern "C" {
  * vendor-namespaced, and as such the relationship between a fourcc code and a
- * modifier is specific to the modifer being used. For example, some modifiers
+ * modifier is specific to the modifier being used. For example, some modifiers
  * may preserve meaning - such as number of planes - from the fourcc code,
@@ -80,3 +80,3 @@ extern "C" {
  *   see modifiers as opaque tokens they can check for equality and intersect.
- *   These users musn't need to know to reason about the modifier value
+ *   These users mustn't need to know to reason about the modifier value
  *   (i.e. they are not expected to extract information out of the modifier).
@@ -539,3 +539,3 @@ extern "C" {
  * are arranged in four groups (two wide, two high) with column-major layout.
- * Each group therefore consits out of four 256 byte units, which are also laid
+ * Each group therefore consists out of four 256 byte units, which are also 
laid
  * out as 2x2 column-major.
@@ -1418,3 +1418,3 @@ drm_fourcc_canonicalize_nvidia_format_mod(uint64_t 
modifier)
  * Indicates the storage is packed when pixel size is multiple of word
- * boudaries, i.e. 8bit should be stored in this mode to save allocation
+ * boundaries, i.e. 8bit should be stored in this mode to save allocation
  * memory.
diff --git a/include/standard-headers/linux/ethtool.h 
b/include/standard-headers/linux/ethtool.h
index 99fcddf04f..e72ec42bf8 100644
--- a/include/standard-headers/linux/ethtool.h
+++ b/include/standard-headers/linux/ethtool.h
@@ -571,3 +571,3 @@ struct ethtool_channels {
  * Drivers should reject a non-zero setting of @autoneg when
- * autoneogotiation is disabled (or not supported) for the link.
+ * autonegotiation is disabled (or not supported) for the link.
  *
diff --git a/include/standard-headers/linux/virtio_console.h 
b/include/standard-headers/linux/virtio_console.h
index 71f5f648e3..8cba52198a 100644
--- a/include/standard-headers/linux/virtio_console.h
+++ b/include/standard-headers/linux/virtio_console.h
@@ -46,3 +46,3 @@
 struct virtio_console_config {
-   /* colums of the screens */
+   /* columns of the screens */
__virtio16 cols;
...

[PATCH for-8.1] tcg: Use HAVE_CMPXCHG128 instead of CONFIG_CMPXCHG128

2023-07-13 Thread Richard Henderson

We adjust CONFIG_ATOMIC128 and CONFIG_CMPXCHG128 with
CONFIG_ATOMIC128_OPT in atomic128.h.  It is difficult
to tell when those changes have been applied with the
ifdef we must use with CONFIG_CMPXCHG128.  So instead
use HAVE_CMPXCHG128, which triggers -Werror-undef when
the proper header has not been included.

Improves tcg_gen_atomic_cmpxchg_i128 for s390x host, which
requires CONFIG_ATOMIC128_OPT.  Without this we fall back
to EXCP_ATOMIC to single-step 128-bit atomics, which is
slow enough to cause some tests to time out.

Reported-by: Thomas Huth 
Signed-off-by: Richard Henderson 
---

Thomas, this issue does not quite match the one you bisected, but
other than the cmpxchg, I don't see any see any qemu_{ld,st}_i128
being used in BootLinuxS390X.test_s390_ccw_virtio_tcg.

As far as I can see, this wasn't broken by the addition of
CONFIG_ATOMIC128_OPT, rather that fix didn't go far enough.

Anyway, test_s390_ccw_virtio_tcg now passes in 159s on our host.


r~
---
 accel/tcg/tcg-runtime.h| 2 +-
 include/exec/helper-proto-common.h | 2 ++
 accel/tcg/cputlb.c | 2 +-
 accel/tcg/user-exec.c  | 2 +-
 tcg/tcg-op-ldst.c  | 2 +-
 accel/tcg/atomic_common.c.inc  | 2 +-
 6 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/accel/tcg/tcg-runtime.h b/accel/tcg/tcg-runtime.h
index 39e68007f9..186899a2c7 100644
--- a/accel/tcg/tcg-runtime.h
+++ b/accel/tcg/tcg-runtime.h
@@ -58,7 +58,7 @@ DEF_HELPER_FLAGS_5(atomic_cmpxchgq_be, TCG_CALL_NO_WG,
 DEF_HELPER_FLAGS_5(atomic_cmpxchgq_le, TCG_CALL_NO_WG,
i64, env, i64, i64, i64, i32)
 #endif
-#ifdef CONFIG_CMPXCHG128
+#if HAVE_CMPXCHG128
 DEF_HELPER_FLAGS_5(atomic_cmpxchgo_be, TCG_CALL_NO_WG,
i128, env, i64, i128, i128, i32)
 DEF_HELPER_FLAGS_5(atomic_cmpxchgo_le, TCG_CALL_NO_WG,
diff --git a/include/exec/helper-proto-common.h 
b/include/exec/helper-proto-common.h
index 4d4b022668..8b67170a22 100644
--- a/include/exec/helper-proto-common.h
+++ b/include/exec/helper-proto-common.h
@@ -7,6 +7,8 @@
 #ifndef HELPER_PROTO_COMMON_H
 #define HELPER_PROTO_COMMON_H
 
+#include "qemu/atomic128.h"  /* for HAVE_CMPXCHG128 */
+
 #define HELPER_H "accel/tcg/tcg-runtime.h"
 #include "exec/helper-proto.h.inc"
 #undef  HELPER_H
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index c2b81ec569..e0079c9a9d 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -3105,7 +3105,7 @@ void cpu_st16_mmu(CPUArchState *env, target_ulong addr, 
Int128 val,
 #include "atomic_template.h"
 #endif
 
-#if defined(CONFIG_ATOMIC128) || defined(CONFIG_CMPXCHG128)
+#if defined(CONFIG_ATOMIC128) || HAVE_CMPXCHG128
 #define DATA_SIZE 16
 #include "atomic_template.h"
 #endif
diff --git a/accel/tcg/user-exec.c b/accel/tcg/user-exec.c
index d95b875a6a..e7225e10e9 100644
--- a/accel/tcg/user-exec.c
+++ b/accel/tcg/user-exec.c
@@ -1385,7 +1385,7 @@ static void *atomic_mmu_lookup(CPUArchState *env, vaddr 
addr, MemOpIdx oi,
 #include "atomic_template.h"
 #endif
 
-#if defined(CONFIG_ATOMIC128) || defined(CONFIG_CMPXCHG128)
+#if defined(CONFIG_ATOMIC128) || HAVE_CMPXCHG128
 #define DATA_SIZE 16
 #include "atomic_template.h"
 #endif
diff --git a/tcg/tcg-op-ldst.c b/tcg/tcg-op-ldst.c
index 0fcc1618e5..d54c305598 100644
--- a/tcg/tcg-op-ldst.c
+++ b/tcg/tcg-op-ldst.c
@@ -778,7 +778,7 @@ typedef void (*gen_atomic_op_i64)(TCGv_i64, TCGv_env, 
TCGv_i64,
 #else
 # define WITH_ATOMIC64(X)
 #endif
-#ifdef CONFIG_CMPXCHG128
+#if HAVE_CMPXCHG128
 # define WITH_ATOMIC128(X) X,
 #else
 # define WITH_ATOMIC128(X)
diff --git a/accel/tcg/atomic_common.c.inc b/accel/tcg/atomic_common.c.inc
index ee222fd7e7..95a5c5ff12 100644
--- a/accel/tcg/atomic_common.c.inc
+++ b/accel/tcg/atomic_common.c.inc
@@ -41,7 +41,7 @@ CMPXCHG_HELPER(cmpxchgq_be, uint64_t)
 CMPXCHG_HELPER(cmpxchgq_le, uint64_t)
 #endif
 
-#ifdef CONFIG_CMPXCHG128
+#if HAVE_CMPXCHG128
 CMPXCHG_HELPER(cmpxchgo_be, Int128)
 CMPXCHG_HELPER(cmpxchgo_le, Int128)
 #endif
-- 
2.34.1

Re: [PATCH for-8.2 v2 5/7] target/riscv/cpu.c: add a ADD_CPU_PROPERTIES_ARRAY() macro

2023-07-13 Thread Richard Henderson


On 7/12/23 21:57, Daniel Henrique Barboza wrote:

+#define ADD_CPU_PROPERTIES_ARRAY(_dev, _array) \
+for (prop = _array; prop && prop->name; prop++) { \
+qdev_property_add_static(_dev, prop); \
+} \


do { } while(0)

Watch the \ on the last line of the macro.
Declare the iterator within the macro, rather than use one defined in the outer 
scope.
Why not use ARRAY_SIZE?


r~

[PATCH 10/18] target/arm: Use clmul_32* routines

2023-07-13 Thread Richard Henderson

Use generic routines for 32-bit carry-less multiply.
Remove our local version of pmull_d.

Signed-off-by: Richard Henderson 
---
 target/arm/tcg/vec_helper.c | 14 +-
 1 file changed, 1 insertion(+), 13 deletions(-)

diff --git a/target/arm/tcg/vec_helper.c b/target/arm/tcg/vec_helper.c
index 1b1d5fccbc..c81447e674 100644
--- a/target/arm/tcg/vec_helper.c
+++ b/target/arm/tcg/vec_helper.c
@@ -2057,18 +2057,6 @@ void HELPER(sve2_pmull_h)(void *vd, void *vn, void *vm, 
uint32_t desc)
 }
 }
 
-static uint64_t pmull_d(uint64_t op1, uint64_t op2)
-{
-uint64_t result = 0;
-int i;
-
-for (i = 0; i < 32; ++i) {
-uint64_t mask = -((op1 >> i) & 1);
-result ^= (op2 << i) & mask;
-}
-return result;
-}
-
 void HELPER(sve2_pmull_d)(void *vd, void *vn, void *vm, uint32_t desc)
 {
 intptr_t sel = H4(simd_data(desc));
@@ -2077,7 +2065,7 @@ void HELPER(sve2_pmull_d)(void *vd, void *vn, void *vm, 
uint32_t desc)
 uint64_t *d = vd;
 
 for (i = 0; i < opr_sz / 8; ++i) {
-d[i] = pmull_d(n[2 * i + sel], m[2 * i + sel]);
+d[i] = clmul_32(n[2 * i + sel], m[2 * i + sel]);
 }
 }
 #endif
-- 
2.34.1

[PATCH 03/18] target/s390x: Use clmul_8* routines

2023-07-13 Thread Richard Henderson

Use generic routines for 8-bit carry-less multiply.
Remove our local version of galois_multiply8.

Signed-off-by: Richard Henderson 
---
 target/s390x/tcg/vec_int_helper.c | 27 ---
 1 file changed, 24 insertions(+), 3 deletions(-)

diff --git a/target/s390x/tcg/vec_int_helper.c 
b/target/s390x/tcg/vec_int_helper.c
index 53ab5c5eb3..e110a7581a 100644
--- a/target/s390x/tcg/vec_int_helper.c
+++ b/target/s390x/tcg/vec_int_helper.c
@@ -14,6 +14,7 @@
 #include "vec.h"
 #include "exec/helper-proto.h"
 #include "tcg/tcg-gvec-desc.h"
+#include "crypto/clmul.h"
 
 static bool s390_vec_is_zero(const S390Vector *v)
 {
@@ -179,7 +180,6 @@ static uint##TBITS##_t 
galois_multiply##BITS(uint##TBITS##_t a,\
 }  
\
 return res;
\
 }
-DEF_GALOIS_MULTIPLY(8, 16)
 DEF_GALOIS_MULTIPLY(16, 32)
 DEF_GALOIS_MULTIPLY(32, 64)
 
@@ -203,6 +203,29 @@ static S390Vector galois_multiply64(uint64_t a, uint64_t b)
 return res;
 }
 
+static Int128 do_gfm8(Int128 n, Int128 m)
+{
+Int128 e = clmul_8x8_even(n, m);
+Int128 o = clmul_8x8_odd(n, m);
+return int128_xor(e, o);
+}
+
+void HELPER(gvec_vgfm8)(void *v1, const void *v2, const void *v3, uint32_t d)
+{
+/*
+ * There is no carry across the two doublewords, so their order
+ * does not matter, so we need not care for host endianness.
+ */
+*(Int128 *)v1 = do_gfm8(*(const Int128 *)v2, *(const Int128 *)v3);
+}
+
+void HELPER(gvec_vgfma8)(void *v1, const void *v2, const void *v3,
+ const void *v4, uint32_t d)
+{
+Int128 r = do_gfm8(*(const Int128 *)v2, *(const Int128 *)v3);
+*(Int128 *)v1 = int128_xor(r, *(Int128 *)v4);
+}
+
 #define DEF_VGFM(BITS, TBITS)  
\
 void HELPER(gvec_vgfm##BITS)(void *v1, const void *v2, const void *v3, 
\
  uint32_t desc)
\
@@ -220,7 +243,6 @@ void HELPER(gvec_vgfm##BITS)(void *v1, const void *v2, 
const void *v3, \
 s390_vec_write_element##TBITS(v1, i, d);   
\
 }  
\
 }
-DEF_VGFM(8, 16)
 DEF_VGFM(16, 32)
 DEF_VGFM(32, 64)
 
@@ -257,7 +279,6 @@ void HELPER(gvec_vgfma##BITS)(void *v1, const void *v2, 
const void *v3,\
 s390_vec_write_element##TBITS(v1, i, d);   
\
 }  
\
 }
-DEF_VGFMA(8, 16)
 DEF_VGFMA(16, 32)
 DEF_VGFMA(32, 64)
 
-- 
2.34.1

[PATCH 16/18] target/ppc: Use clmul_64

2023-07-13 Thread Richard Henderson

Use generic routine for 64-bit carry-less multiply.

Signed-off-by: Richard Henderson 
---
 target/ppc/int_helper.c | 17 +++--
 1 file changed, 3 insertions(+), 14 deletions(-)

diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c
index 828f04bce7..4e1fa2fd68 100644
--- a/target/ppc/int_helper.c
+++ b/target/ppc/int_helper.c
@@ -1455,20 +1455,9 @@ void helper_vpmsumw(ppc_avr_t *r, ppc_avr_t *a, 
ppc_avr_t *b)
 
 void helper_VPMSUMD(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
 {
-int i, j;
-Int128 tmp, prod[2] = {int128_zero(), int128_zero()};
-
-for (j = 0; j < 64; j++) {
-for (i = 0; i < ARRAY_SIZE(r->u64); i++) {
-if (a->VsrD(i) & (1ull << j)) {
-tmp = int128_make64(b->VsrD(i));
-tmp = int128_lshift(tmp, j);
-prod[i] = int128_xor(prod[i], tmp);
-}
-}
-}
-
-r->s128 = int128_xor(prod[0], prod[1]);
+Int128 e = clmul_64(a->u64[0], b->u64[0]);
+Int128 o = clmul_64(a->u64[1], b->u64[1]);
+r->s128 = int128_xor(e, o);
 }
 
 #if HOST_BIG_ENDIAN
-- 
2.34.1

[RFC PATCH for-8.2 00/18] crypto: Provide clmul.h and host accel

2023-07-13 Thread Richard Henderson

Inspired by Ard Biesheuvel's RFC patches [1] for accelerating
carry-less multiply under emulation.

This is less polished than the AES patch set:

(1) Should I split HAVE_CLMUL_ACCEL into per-width HAVE_CLMUL{N}_ACCEL?
The "_generic" and "_accel" split is different from aes-round.h
because of the difference in support for different widths, and it
means that each host accel has more boilerplate.

(2) Should I bother trying to accelerate anything other than 64x64->128?
That seems to be the one that GSM really wants anyway.  I'd keep all
of the sizes implemented generically, since that centralizes the 3
target implementations.

(3) The use of Int128 isn't fantastic -- better would be a vector type,
though that has its own special problems for ppc64le (see the
endianness hoops within aes-round.h).  Perhaps leave things in
env memory, like I was mostly able to do with AES?

(4) No guest test case(s).


r~


[1] https://patchew.org/QEMU/20230601123332.3297404-1-a...@kernel.org/

Richard Henderson (18):
  crypto: Add generic 8-bit carry-less multiply routines
  target/arm: Use clmul_8* routines
  target/s390x: Use clmul_8* routines
  target/ppc: Use clmul_8* routines
  crypto: Add generic 16-bit carry-less multiply routines
  target/arm: Use clmul_16* routines
  target/s390x: Use clmul_16* routines
  target/ppc: Use clmul_16* routines
  crypto: Add generic 32-bit carry-less multiply routines
  target/arm: Use clmul_32* routines
  target/s390x: Use clmul_32* routines
  target/ppc: Use clmul_32* routines
  crypto: Add generic 64-bit carry-less multiply routine
  target/arm: Use clmul_64
  target/s390x: Use clmul_64
  target/ppc: Use clmul_64
  host/include/i386: Implement clmul.h
  host/include/aarch64: Implement clmul.h

 host/include/aarch64/host/cpuinfo.h  |   1 +
 host/include/aarch64/host/crypto/clmul.h | 230 +++
 host/include/generic/host/crypto/clmul.h |  28 +++
 host/include/i386/host/cpuinfo.h |   1 +
 host/include/i386/host/crypto/clmul.h| 187 ++
 host/include/x86_64/host/crypto/clmul.h  |   1 +
 include/crypto/clmul.h   | 123 
 target/arm/tcg/vec_internal.h|  11 --
 crypto/clmul.c   | 163 
 target/arm/tcg/mve_helper.c  |  16 +-
 target/arm/tcg/vec_helper.c  | 112 ++-
 target/ppc/int_helper.c  |  63 +++
 target/s390x/tcg/vec_int_helper.c| 175 +++--
 util/cpuinfo-aarch64.c   |   4 +-
 util/cpuinfo-i386.c  |   1 +
 crypto/meson.build   |   9 +-
 16 files changed, 865 insertions(+), 260 deletions(-)
 create mode 100644 host/include/aarch64/host/crypto/clmul.h
 create mode 100644 host/include/generic/host/crypto/clmul.h
 create mode 100644 host/include/i386/host/crypto/clmul.h
 create mode 100644 host/include/x86_64/host/crypto/clmul.h
 create mode 100644 include/crypto/clmul.h
 create mode 100644 crypto/clmul.c

-- 
2.34.1

1 2 >

1 - 100 of 160 matches

Mail list logo