Re: [PULL v2 25/25] qga/linux: Add new api 'guest-network-get-route'

2024-08-19 Thread Konstantin Kostiuk
On Thu, Aug 15, 2024 at 5:18 PM Peter Maydell 
wrote:

> On Mon, 29 Jul 2024 at 10:35, Peter Maydell 
> wrote:
> >
> > On Mon, 29 Jul 2024 at 08:40, Konstantin Kostiuk 
> wrote:
> > >
> > > Hi Peter,
> > >
> > > How can I see the full Coverity report? In
> > > https://gitlab.com/qemu-project/qemu/-/artifacts, I see only job.log
> > > Do you expect to fix these errors for the 9.1 release?
> >
> > Coverity errors are in https://scan.coverity.com/projects/qemu
> >  -- you can ask for an account with the project if you want
> > to see them directly. But I think you have the information
> > you need in this email: the actual coverity issue isn't
> > much more informative.
> >
> > > Do you expect to fix these errors for the 9.1 release?
> >
> > No, I post these emails to inform the people responsible
> > for the original commits about the problem so that they
> > can provide fixes -- after all, it's the original author
> > that knows most about the code and how to test it.
>
> Konstantin, are you or Dehan planning to write fixes
> for these bugs?
>

Hi Peter,

Yes, we plan to fix these bugs for the 9.2 release.

Best Regards,
Konstantin Kostiuk.


>
> thanks
> -- PMM
>
>


[PATCH for-9.1] target/i386: Fix tss access size in switch_tss_ra

2024-08-19 Thread Richard Henderson
The two limit_max variables represent size - 1, just like the
encoding in the GDT, thus the 'old' access was off by one.
Access the minimal size of the new tss: the complete tss contains
the iopb, which may be a larger block than the access api expects,
and irrelevant because the iopb is not accessed during the
switch itself.

Fixes: 8b131065080a ("target/i386/tcg: use X86Access for TSS access")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2511
Signed-off-by: Richard Henderson 
---
 target/i386/tcg/seg_helper.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/target/i386/tcg/seg_helper.c b/target/i386/tcg/seg_helper.c
index bab552cd53..3b8fd827e1 100644
--- a/target/i386/tcg/seg_helper.c
+++ b/target/i386/tcg/seg_helper.c
@@ -378,7 +378,7 @@ static int switch_tss_ra(CPUX86State *env, int tss_selector,
 
 /* X86Access avoids memory exceptions during the task switch */
 mmu_index = cpu_mmu_index_kernel(env);
-access_prepare_mmu(&old, env, env->tr.base, old_tss_limit_max,
+access_prepare_mmu(&old, env, env->tr.base, old_tss_limit_max + 1,
MMU_DATA_STORE, mmu_index, retaddr);
 
 if (source == SWITCH_TSS_CALL) {
@@ -386,7 +386,8 @@ static int switch_tss_ra(CPUX86State *env, int tss_selector,
 probe_access(env, tss_base, 2, MMU_DATA_STORE,
  mmu_index, retaddr);
 }
-access_prepare_mmu(&new, env, tss_base, tss_limit,
+/* While true tss_limit may be larger, we don't access the iopb here. */
+access_prepare_mmu(&new, env, tss_base, tss_limit_max + 1,
MMU_DATA_LOAD, mmu_index, retaddr);
 
 /* save the current state in the old TSS */
-- 
2.43.0
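
For readers following the off-by-one described above, here is a minimal
sketch (not QEMU code) of the "limit is size - 1" convention from the
commit message. The helper names limit_to_size and access_in_limit are
hypothetical, and the limits 43 and 103 are assumed here to correspond to
the minimal 44-byte 16-bit and 104-byte 32-bit TSS layouts.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values: limits are stored as "size - 1", as in the GDT. */
enum {
    DEMO_TSS16_LIMIT_MAX = 43,   /* minimal 16-bit TSS is 44 bytes */
    DEMO_TSS32_LIMIT_MAX = 103,  /* minimal 32-bit TSS is 104 bytes */
};

/* Convert a descriptor-style limit (last valid offset) to a byte count. */
static inline uint32_t limit_to_size(uint32_t limit)
{
    assert(limit != UINT32_MAX); /* limit + 1 would wrap around */
    return limit + 1;
}

/* Return nonzero if an access of len bytes at off stays within limit. */
static inline int access_in_limit(uint32_t off, uint32_t len, uint32_t limit)
{
    return len != 0 && off <= limit && len - 1 <= limit - off;
}
```

Passing the limit itself where a byte count is expected covers one byte
too few, which is the kind of mistake the "+ 1" in the patch addresses.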




Re: [PULL 0/1] riscv-to-apply queue

2024-08-19 Thread Richard Henderson

On 8/19/24 14:43, Alistair Francis wrote:

The following changes since commit 2eefd4fcec4b8fe41ceee2a8f00cdec1fe81b75c:

   Merge tag 'pull-maintainer-9.1-rc3-160824-1' of https://gitlab.com/stsquad/qemu into staging (2024-08-17 16:46:45 +1000)

are available in the Git repository at:

   https://github.com/alistair23/qemu.git tags/pull-riscv-to-apply-20240819-1

for you to fetch changes up to 6df664f87c738788891f3bda701e63e23a0dbbc2:

   Revert "hw/riscv/virt.c: imsics DT: add '#msi-cells'" (2024-08-19 14:34:49 +1000)


RISC-V PR for 9.1

This reverts a commit adding `#msi-cells=<0>` to the virt machine
as that commit results in PCI devices being unable to use MSIs. Even though
it's a kernel bug, we don't want to break existing users.

* Revert adding #msi-cells to virt machine


Applied, thanks.  Please update https://wiki.qemu.org/ChangeLog/9.1 as 
appropriate.

r~



Re: [PATCH 2/2] hw/riscv/virt: Introduce strict-dt

2024-08-19 Thread Andrew Jones
On Mon, Aug 19, 2024 at 11:19:18AM GMT, Alistair Francis wrote:
> On Sat, Aug 17, 2024 at 2:08 AM Andrew Jones  wrote:
> >
> > Older firmwares and OS kernels which use deprecated device tree
> > properties or are missing support for new properties may not be
> > tolerant of fully compliant device trees. When divergence from the
> > bindings specifications is harmless for new firmwares and OS kernels
> > which are compliant, then it's probably better to also continue
> > supporting the old firmwares and OS kernels by generating
> > non-compliant device trees. The '#msi-cells=<0>' property of the
> > imsic is one such property. Generating that property doesn't provide
> > anything necessary (no '#msi-cells' property or an '#msi-cells'
> > property with a value of zero mean the same thing) but it does
> > cause PCI devices to fail to find the MSI controller on Linux and,
> > for that reason, riscv virt doesn't currently generate it despite
> > that putting the DT out of compliance. For users that want a
> > compliant DT and know their software supports it, introduce a machine
> > property 'strict-dt' to do so. We also drop the one redundant
> > property that uses a deprecated name when strict-dt is enabled.
> >
> > Signed-off-by: Andrew Jones 
> > ---
> >  docs/system/riscv/virt.rst | 11 ++
> >  hw/riscv/virt.c| 43 ++
> >  include/hw/riscv/virt.h|  1 +
> >  3 files changed, 46 insertions(+), 9 deletions(-)
> >
> > diff --git a/docs/system/riscv/virt.rst b/docs/system/riscv/virt.rst
> > index 9a06f95a3444..f08d0a053051 100644
> > --- a/docs/system/riscv/virt.rst
> > +++ b/docs/system/riscv/virt.rst
> > @@ -116,6 +116,17 @@ The following machine-specific options are supported:
> >having AIA IMSIC (i.e. "aia=aplic-imsic" selected). When not specified,
> >the default number of per-HART VS-level AIA IMSIC pages is 0.
> >
> > +- strict-dt=[on|off]
> 
> Hmm... I don't love the idea of having yet another command line option.
> 
> Does this really buy us a lot? Eventually we should deprecate the
> invalid DT bindings anyway

I agree we should deprecate the invalid DT usage, with the goal of only
generating DTs that make the validator happy. I'm not sure how long that
deprecation period should be, though. It may need to be a while since
we'll need to decide when we've waited long enough to no longer care
about older kernels. In the meantime, we won't be making the validator
happy and may get bug reports due to that. With strict-dt we can just
direct people in that direction. Also, I wouldn't be surprised if
something else like this comes along some day, which is why I tried to
make the option as generic as possible. Finally, the 'if (strict_dt)'
self-documents to some extent. Otherwise we'll need to add comments
around explaining why we're diverging from the specs. Although we should
probably do that anyway, i.e. I should have put a comment on the
'if (strict-dt) then #msi-cells' explaining why it's under strict-dt.
If we want strict-dt, then I'll send a v2 doing that. If we don't want
strict-dt then I'll send a v2 with just a comment explaining why
#msi-cells was left out.

Thanks,
drew



[PATCH qemu v5 1/1] target/riscv: Add Zilsd and Zclsd extension support

2024-08-19 Thread ~liuxu
From: lxx <1733205...@qq.com>

This patch adds support for the Zilsd and Zclsd extensions,
which is documented at https://github.com/riscv/riscv-zilsd/releases/tag/v0.9.0

Co-developed-by: SUN Dongya 
Co-developed-by: LIU Xu 
Co-developed-by: ZHAO Fujin 
---
 target/riscv/cpu.c|   4 +
 target/riscv/cpu_cfg.h|   2 +
 target/riscv/insn16.decode|   8 ++
 target/riscv/insn32.decode|  12 ++-
 target/riscv/insn_trans/trans_zilsd.c.inc | 112 ++
 target/riscv/tcg/tcg-cpu.c|  16 
 target/riscv/translate.c  |   1 +
 7 files changed, 153 insertions(+), 2 deletions(-)
 create mode 100644 target/riscv/insn_trans/trans_zilsd.c.inc

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 36e3e5fdaf..be9746d361 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -113,6 +113,7 @@ const RISCVIsaExtData isa_edata_arr[] = {
 ISA_EXT_DATA_ENTRY(zihintntl, PRIV_VERSION_1_10_0, ext_zihintntl),
 ISA_EXT_DATA_ENTRY(zihintpause, PRIV_VERSION_1_10_0, ext_zihintpause),
 ISA_EXT_DATA_ENTRY(zihpm, PRIV_VERSION_1_12_0, ext_zihpm),
+ISA_EXT_DATA_ENTRY(zilsd, PRIV_VERSION_1_12_0, ext_zilsd),
 ISA_EXT_DATA_ENTRY(zmmul, PRIV_VERSION_1_12_0, ext_zmmul),
 ISA_EXT_DATA_ENTRY(za64rs, PRIV_VERSION_1_12_0, has_priv_1_11),
 ISA_EXT_DATA_ENTRY(zaamo, PRIV_VERSION_1_12_0, ext_zaamo),
@@ -132,6 +133,7 @@ const RISCVIsaExtData isa_edata_arr[] = {
 ISA_EXT_DATA_ENTRY(zce, PRIV_VERSION_1_12_0, ext_zce),
 ISA_EXT_DATA_ENTRY(zcmp, PRIV_VERSION_1_12_0, ext_zcmp),
 ISA_EXT_DATA_ENTRY(zcmt, PRIV_VERSION_1_12_0, ext_zcmt),
+ISA_EXT_DATA_ENTRY(zclsd, PRIV_VERSION_1_12_0, ext_zclsd),
 ISA_EXT_DATA_ENTRY(zba, PRIV_VERSION_1_12_0, ext_zba),
 ISA_EXT_DATA_ENTRY(zbb, PRIV_VERSION_1_12_0, ext_zbb),
 ISA_EXT_DATA_ENTRY(zbc, PRIV_VERSION_1_12_0, ext_zbc),
@@ -1492,6 +1494,7 @@ const RISCVCPUMultiExtConfig riscv_cpu_extensions[] = {
 
 MULTI_EXT_CFG_BOOL("zicntr", ext_zicntr, true),
 MULTI_EXT_CFG_BOOL("zihpm", ext_zihpm, true),
+MULTI_EXT_CFG_BOOL("zilsd", ext_zilsd, false),
 
 MULTI_EXT_CFG_BOOL("zba", ext_zba, true),
 MULTI_EXT_CFG_BOOL("zbb", ext_zbb, true),
@@ -1531,6 +1534,7 @@ const RISCVCPUMultiExtConfig riscv_cpu_extensions[] = {
 MULTI_EXT_CFG_BOOL("zcmp", ext_zcmp, false),
 MULTI_EXT_CFG_BOOL("zcmt", ext_zcmt, false),
 MULTI_EXT_CFG_BOOL("zicond", ext_zicond, false),
+MULTI_EXT_CFG_BOOL("zclsd", ext_zclsd, false),
 
 /* Vector cryptography extensions */
 MULTI_EXT_CFG_BOOL("zvbb", ext_zvbb, false),
diff --git a/target/riscv/cpu_cfg.h b/target/riscv/cpu_cfg.h
index cb750154bd..76ae1e95d7 100644
--- a/target/riscv/cpu_cfg.h
+++ b/target/riscv/cpu_cfg.h
@@ -51,6 +51,7 @@ struct RISCVCPUConfig {
 bool ext_zcf;
 bool ext_zcmp;
 bool ext_zcmt;
+bool ext_zclsd;
 bool ext_zk;
 bool ext_zkn;
 bool ext_zknd;
@@ -71,6 +72,7 @@ struct RISCVCPUConfig {
 bool ext_zihintntl;
 bool ext_zihintpause;
 bool ext_zihpm;
+bool ext_zilsd;
 bool ext_ztso;
 bool ext_smstateen;
 bool ext_sstc;
diff --git a/target/riscv/insn16.decode b/target/riscv/insn16.decode
index b96c534e73..bcd94e41e1 100644
--- a/target/riscv/insn16.decode
+++ b/target/riscv/insn16.decode
@@ -130,10 +130,14 @@ sw110  ... ... .. ... 00 @cs_w
 {
   ld  011  ... ... .. ... 00 @cl_d
   c_flw   011  ... ... .. ... 00 @cl_w
+  # *** Zclsd Extension ***
+  zclsd_ld011  ... ... .. ... 00 @cl_d
 }
 {
   sd  111  ... ... .. ... 00 @cs_d
   c_fsw   111  ... ... .. ... 00 @cs_w
+  # *** Zclsd Extension ***
+  zclsd_sd111  ... ... .. ... 00 @cs_d
 }
 
 # *** RV32/64C Standard Extension (Quadrant 1) ***
@@ -207,10 +211,14 @@ sw110 .  .  . 10 @c_swsp
   c64_illegal 011 -  0  - 10 # c.ldsp, RES rd=0
   ld  011 .  .  . 10 @c_ldsp
   c_flw   011 .  .  . 10 @c_lwsp
+  # *** Zclsd Extension ***
+  zclsd_ldsp  011 .  .  . 10 @c_ldsp
 }
 {
   sd  111 .  .  . 10 @c_sdsp
   c_fsw   111 .  .  . 10 @c_swsp
+  # *** Zclsd Extension ***
+  zclsd_sd111 .  .  . 10 @c_sdsp
 }
 
 # *** RV64 and RV32 Zcb Extension ***
diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index f22df04cfd..f6f4b7950b 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -169,8 +169,16 @@ csrrci    . 111 . 1110011 @csr
 
 # *** RV64I Base Instruction Set (in addition to RV32I) ***
 lwu     . 110 . 011 @i
-ld      . 011 . 011 @i
-sd   ... .  . 011 . 0100011 @s
+{
+  ld      . 011 . 011 @i
+  # *** Zilsd instructions ***
+  zilsd_ld    . 011 . 011 @i
+}
+{
+  sd   ... . 

[PATCH qemu v5 0/1] target/riscv: Add Zilsd and Zclsd extension support

2024-08-19 Thread ~liuxu
Fix for the last reply:
https://lists.gnu.org/archive/html/qemu-devel/2024-08/msg02469.html

lxx (1):
  target/riscv: Add Zilsd and Zclsd extension support

 target/riscv/cpu.c|   4 +
 target/riscv/cpu_cfg.h|   2 +
 target/riscv/insn16.decode|   8 ++
 target/riscv/insn32.decode|  12 ++-
 target/riscv/insn_trans/trans_zilsd.c.inc | 112 ++
 target/riscv/tcg/tcg-cpu.c|  16 
 target/riscv/translate.c  |   1 +
 7 files changed, 153 insertions(+), 2 deletions(-)
 create mode 100644 target/riscv/insn_trans/trans_zilsd.c.inc

-- 
2.45.2



Re: [PATCH 2/2] hw/riscv/virt: Introduce strict-dt

2024-08-19 Thread Richard Henderson

On 8/19/24 17:50, Andrew Jones wrote:

I agree we should deprecate the invalid DT usage, with the goal of only
generating DTs that make the validator happy. I'm not sure how long that
deprecation period should be, though. It may need to be a while since
we'll need to decide when we've waited long enough to no longer care
about older kernels.


This is the kind of thing versioned machine models are good for.

For instance, for the next release define virt-9.1 and virt-9.2.
Set strict-dt in virt-9.2 and reset it in virt-9.1.

Cf. hw/arm/virt.c, where virt_machine_8_2_options invokes virt_machine_9_0_options and 
then adjusts for backward compatibility.



r~
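
The chaining pattern suggested above can be sketched roughly as follows.
This is only an illustration under stated assumptions: DemoMachineClass
and the demo_virt_*_options functions are hypothetical stand-ins, not
QEMU's MachineClass or any existing riscv virt code, and strict_dt is
modelled as a plain bool.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a machine class; not QEMU's MachineClass. */
typedef struct DemoMachineClass {
    bool strict_dt;
} DemoMachineClass;

/* Newest version: carries the current defaults. */
static void demo_virt_9_2_options(DemoMachineClass *mc)
{
    assert(mc);
    mc->strict_dt = true;
}

/* Older version: start from the newer defaults, then undo what changed. */
static void demo_virt_9_1_options(DemoMachineClass *mc)
{
    demo_virt_9_2_options(mc);
    mc->strict_dt = false;
}
```

Each older version calls the next newer one and then reverts only the
differences, so a new default needs to be written once and each release
records just its own compatibility adjustments.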



Re: [PATCH v2 04/17] intel_iommu: Flush stage-2 cache in PASID-selective PASID-based iotlb invalidation

2024-08-19 Thread Yi Liu

On 2024/8/15 13:48, Duan, Zhenzhong wrote:




-Original Message-
From: Liu, Yi L 
Subject: Re: [PATCH v2 04/17] intel_iommu: Flush stage-2 cache in PASID-
selective PASID-based iotlb invalidation

On 2024/8/5 14:27, Zhenzhong Duan wrote:

Per spec 6.5.2.4, PASID-selective PASID-based iotlb invalidation will
flush stage-2 iotlb entries with matching domain id and pasid.

With scalable modern mode introduced, guest could send PASID-selective
PASID-based iotlb invalidation to flush both stage-1 and stage-2 entries.


I'm not quite sure if this is correct. In the last column of Table 21
in 6.5.2.4, the paging structures of SS will not be invalidated. So it's
not quite recommended for software to invalidate the iotlb entries with
PGTT==SS-only by P_IOTLB invalidation, it's more recommended to use the
IOTLB invalidation.


Hmm, when pasid is used with SS-only, PASID-based iotlb invalidation can
give better granularity, (DID,PASID) vs. (DID) for IOTLB invalidation.


But you may need to submit multiple PASID-based iotlb invalidations as a SS
page table can be used by multiple pasid entries. From this perspective,
issuing a single IOTLB invalidation is simpler for the programmer. And this is
also what VT-d spec recommends in
"Table 25. Guidance to Software for Invalidations".

In fact, for leaf modifications, the iommu driver should use the
page-selective granularity instead of pasid-selective. However, the line
3/column 2 of "Table 21. PASID-based-IOTLB Invalidation" says it is NA to
the iotlb entries with PGTT==010b.



If non-leaf SS-paging entry is updated, IOTLB invalidation should be used
as SS-paging structure cache isn't flushed with PASID-selective PASID-based
iotlb invalidation.


yes in concept.
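
To make the granularity difference discussed above concrete, here is a
small sketch comparing a (DID) match with a (DID, PASID) match over a
cache of tagged entries. It is an assumption-laden toy model, not the
intel_iommu code: struct demo_iotlb_entry and both demo_flush_* helpers
are hypothetical, and a flat array stands in for the real hash table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cached translation, tagged with domain id and PASID. */
struct demo_iotlb_entry {
    uint16_t domain_id;
    uint32_t pasid;
    bool valid;
};

/* Domain-selective flush: drops every valid entry tagged with did. */
static size_t demo_flush_by_domain(struct demo_iotlb_entry *e, size_t n,
                                   uint16_t did)
{
    size_t flushed = 0;

    assert(e != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (e[i].valid && e[i].domain_id == did) {
            e[i].valid = false;
            flushed++;
        }
    }
    return flushed;
}

/* PASID-selective flush: drops only entries matching did and pasid. */
static size_t demo_flush_by_pasid(struct demo_iotlb_entry *e, size_t n,
                                  uint16_t did, uint32_t pasid)
{
    size_t flushed = 0;

    assert(e != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (e[i].valid && e[i].domain_id == did && e[i].pasid == pasid) {
            e[i].valid = false;
            flushed++;
        }
    }
    return flushed;
}
```

In this model one domain-wide flush covers every PASID sharing the page
table, while the PASID-selective variant has to be repeated per PASID,
which is the trade-off raised in the exchange above.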






While at it, remove the old IOTLB-related definitions.

Signed-off-by: Zhenzhong Duan 
---
   hw/i386/intel_iommu_internal.h | 14 +++---
   hw/i386/intel_iommu.c  | 81 ++
   2 files changed, 90 insertions(+), 5 deletions(-)

diff --git a/hw/i386/intel_iommu_internal.h b/hw/i386/intel_iommu_internal.h
index 8fa27c7f3b..19e4ed52ca 100644
--- a/hw/i386/intel_iommu_internal.h
+++ b/hw/i386/intel_iommu_internal.h
@@ -402,11 +402,6 @@ typedef union VTDInvDesc VTDInvDesc;
   #define VTD_INV_DESC_IOTLB_AM(val)  ((val) & 0x3fULL)
   #define VTD_INV_DESC_IOTLB_RSVD_LO  0xff00ULL
   #define VTD_INV_DESC_IOTLB_RSVD_HI  0xf80ULL
-#define VTD_INV_DESC_IOTLB_PASID_PASID  (2ULL << 4)
-#define VTD_INV_DESC_IOTLB_PASID_PAGE   (3ULL << 4)
-#define VTD_INV_DESC_IOTLB_PASID(val)   (((val) >> 32) & VTD_PASID_ID_MASK)
-#define VTD_INV_DESC_IOTLB_PASID_RSVD_LO 0xfff001c0ULL
-#define VTD_INV_DESC_IOTLB_PASID_RSVD_HI  0xf80ULL

   /* Mask for Device IOTLB Invalidate Descriptor */
   #define VTD_INV_DESC_DEVICE_IOTLB_ADDR(val) ((val) & 0xf000ULL)

@@ -438,6 +433,15 @@ typedef union VTDInvDesc VTDInvDesc;
   (0x3800ULL | ~(VTD_HAW_MASK(aw) | VTD_SL_IGN_COM | VTD_SL_TM)) : \
   (0x3800ULL | ~(VTD_HAW_MASK(aw) | VTD_SL_IGN_COM))

+/* Masks for PIOTLB Invalidate Descriptor */
+#define VTD_INV_DESC_PIOTLB_G (3ULL << 4)
+#define VTD_INV_DESC_PIOTLB_ALL_IN_PASID  (2ULL << 4)
+#define VTD_INV_DESC_PIOTLB_PSI_IN_PASID  (3ULL << 4)
+#define VTD_INV_DESC_PIOTLB_DID(val)  (((val) >> 16) & VTD_DOMAIN_ID_MASK)
+#define VTD_INV_DESC_PIOTLB_PASID(val)(((val) >> 32) & 0xfULL)
+#define VTD_INV_DESC_PIOTLB_RSVD_VAL0 0xfff0f1c0ULL
+#define VTD_INV_DESC_PIOTLB_RSVD_VAL1 0xf80ULL
+
   /* Information about page-selective IOTLB invalidate */
   struct VTDIOTLBPageInvInfo {
   uint16_t domain_id;
diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index c1382a5651..df591419b7 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -2656,6 +2656,83 @@ static bool vtd_process_iotlb_desc(IntelIOMMUState *s, VTDInvDesc *inv_desc)
   return true;
   }

+static gboolean vtd_hash_remove_by_pasid(gpointer key, gpointer value,
+ gpointer user_data)
+{
+VTDIOTLBEntry *entry = (VTDIOTLBEntry *)value;
+VTDIOTLBPageInvInfo *info = (VTDIOTLBPageInvInfo *)user_data;
+
+return ((entry->domain_id == info->domain_id) &&
+(entry->pasid == info->pasid));
+}
+
+static void vtd_piotlb_pasid_invalidate(IntelIOMMUState *s,
+uint16_t domain_id, uint32_t pasid)
+{
+VTDIOTLBPageInvInfo info;
+VTDAddressSpace *vtd_as;
+VTDContextEntry ce;
+
+info.domain_id = domain_id;
+info.pasid = pasid;
+
+vtd_iommu_lock(s);
+g_hash_table_foreach_remove(s->iotlb, vtd_hash_remove_by_pasid,
+&info);
+vtd_iommu_unlock(s);
+
+QLIST_FOREACH(vtd_as, &s->vtd_as_with_notifiers, next) {
+if (!vtd_dev_to_context_entry(s, pci_bus_num(vtd_as->bus),
+  vtd_as->devfn, &ce) &&
+domain_id == v

Re: [PATCH v4 3/6] device/virtio-nsm: Support for Nitro Secure Module device

2024-08-19 Thread Alexander Graf


On 18.08.24 13:42, Dorjoy Chowdhury wrote:

Nitro Secure Module (NSM)[1] device is used in AWS Nitro Enclaves for
stripped down TPM functionality like cryptographic attestation. The
requests to and responses from NSM device are CBOR[2] encoded.

This commit adds support for NSM device in QEMU. Although related to
AWS Nitro Enclaves, the virtio-nsm device is independent and can be
used in other machine types as well. The libcbor[3] library has been
used for the CBOR encoding and decoding functionalities.

[1] https://lists.oasis-open.org/archives/virtio-comment/202310/msg00387.html
[2] http://cbor.io/
[3] https://libcbor.readthedocs.io/en/latest/

Signed-off-by: Dorjoy Chowdhury 



[...]



+static bool add_payload_to_cose(cbor_item_t *cose, VirtIONSM *vnsm)
+{
+cbor_item_t *root = NULL;
+cbor_item_t *nested_map;
+cbor_item_t *bs = NULL;
+size_t locked_cnt;
+uint8_t ind[NSM_MAX_PCRS];
+size_t payload_map_size = 6;
+size_t len;
+struct PCRInfo *pcr;
+uint8_t zero[64] = {0};
+bool r = false;
+size_t buf_len = 16384;
+uint8_t *buf = g_malloc(buf_len);
+
+if (vnsm->public_key_len > 0) {
+payload_map_size++;
+}
+if (vnsm->user_data_len > 0) {
+payload_map_size++;
+}
+if (vnsm->nonce_len > 0) {
+payload_map_size++;
+}



Now that you're always emitting user_data and nonce, you should include 
them in payload_map_size unconditionally as well; otherwise your map is 
too small to hold all members.


In addition, a real Nitro Enclave attestation document will return Null 
objects for these fields when they're not set instead of empty strings. 
With the patch below I was able to generate a doc that looks very 
similar to a real one:


diff --git a/hw/virtio/cbor-helpers.c b/hw/virtio/cbor-helpers.c
index 5140020d4e..ffecc97c48 100644
--- a/hw/virtio/cbor-helpers.c
+++ b/hw/virtio/cbor-helpers.c
@@ -140,7 +140,11 @@ bool qemu_cbor_add_bytestring_to_map(cbor_item_t *map, const char *key,
 if (!key_cbor) {
 goto cleanup;
 }
-    value_cbor = cbor_build_bytestring(arr, len);
+    if (len) {
+    value_cbor = cbor_build_bytestring(arr, len);
+    } else {
+    value_cbor = cbor_new_null();
+    }
 if (!value_cbor) {
 goto cleanup;
 }
@@ -241,7 +245,11 @@ bool qemu_cbor_add_uint8_key_bytestring_to_map(cbor_item_t *map, uint8_t key,
 if (!key_cbor) {
 goto cleanup;
 }
-    value_cbor = cbor_build_bytestring(buf, len);
+    if (len) {
+    value_cbor = cbor_build_bytestring(buf, len);
+    } else {
+    value_cbor = cbor_new_null();
+    }
 if (!value_cbor) {
 goto cleanup;
 }
diff --git a/hw/virtio/virtio-nsm.c b/hw/virtio/virtio-nsm.c
index e91848a2b0..b45d97efe2 100644
--- a/hw/virtio/virtio-nsm.c
+++ b/hw/virtio/virtio-nsm.c
@@ -1126,7 +1126,7 @@ static bool add_payload_to_cose(cbor_item_t *cose, VirtIONSM *vnsm)
 cbor_item_t *bs = NULL;
 size_t locked_cnt;
 uint8_t ind[NSM_MAX_PCRS];
-    size_t payload_map_size = 6;
+    size_t payload_map_size = 8;
 size_t len;
 struct PCRInfo *pcr;
 uint8_t zero[64] = {0};
@@ -1137,12 +1137,6 @@ static bool add_payload_to_cose(cbor_item_t *cose, VirtIONSM *vnsm)
 if (vnsm->public_key_len > 0) {
 payload_map_size++;
 }
-    if (vnsm->user_data_len > 0) {
-    payload_map_size++;
-    }
-    if (vnsm->nonce_len > 0) {
-    payload_map_size++;
-    }
 root = cbor_new_definite_map(payload_map_size);
 if (!root) {
 goto cleanup;


Alex
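
The sizing point made in this review can be sketched in isolation as
follows. This is a simplified model, not the virtio-nsm code: struct
demo_nsm and demo_payload_map_size are hypothetical names, the base count
of 6 is taken from the quoted patch, and no libcbor calls are involved.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical subset of the device state used for this sketch. */
struct demo_nsm {
    size_t public_key_len;
    size_t user_data_len;
    size_t nonce_len;
};

/* Number of map members that are always present in the quoted patch. */
enum { DEMO_BASE_ENTRIES = 6 };

/*
 * Count the members of the payload map. user_data and nonce are counted
 * unconditionally because they are always emitted (as a null value when
 * empty); only public_key remains optional.
 */
static size_t demo_payload_map_size(const struct demo_nsm *s)
{
    size_t n = DEMO_BASE_ENTRIES + 2;

    assert(s);
    if (s->public_key_len > 0) {
        n++;
    }
    return n;
}
```

Keeping the count in one helper makes it easier to see that the declared
map size matches the members that are actually added afterwards.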






Re: [PATCH v2 16/17] intel_iommu: Introduce a property to control FS1GP cap bit setting

2024-08-19 Thread Yi Liu

On 2024/8/15 11:46, Duan, Zhenzhong wrote:




-Original Message-
From: Liu, Yi L 
Subject: Re: [PATCH v2 16/17] intel_iommu: Introduce a property to control
FS1GP cap bit setting

On 2024/8/5 14:27, Zhenzhong Duan wrote:

When host IOMMU doesn't support FS1GP but vIOMMU does, host IOMMU
can't translate stage-1 page table from guest correctly.


this series is for emulated devices, so the above statement does not
belong to this series. Is there any other reason to have this option?


Good catch, will remove this comment.
In fact, this patch is mainly for passthrough device where host IOMMU doesn't 
support fs1gp.


I see. To me, as long as the vIOMMU page walk logic supports 1GP large
pages, it's ok to report the FS1GP cap to VM. But it is still fine to
have this property to opt out of FS1GP if the admin/orchestration layer (e.g. libvirt)
knows no hw iommu has this capability, so it is better to opt out of it
before invoking QEMU.

Is this your motivation for this property?




Add a property x-cap-fs1gp for user to turn FS1GP off so that
nested page table on host side works.


I guess you would need to sync the FS1GP cap with the host before reporting it
in the vIOMMU when it comes to supporting passthrough devices.


Yes, we already have this check, see 
https://github.com/yiliu1765/qemu/commit/b7ac7ce3a2e21eb1b3172743ee6f73e80fe67b3a


good to know it. :) Will you fail the VM if the device's iommu does not
support FS1GP or just mask out the FS1GP?

--
Regards,
Yi Liu



Re: [PATCH v2 13/17] intel_iommu: piotlb invalidation should notify unmap

2024-08-19 Thread Yi Liu

On 2024/8/5 14:27, Zhenzhong Duan wrote:

This is used by some emulated devices which cache address
translation results. When a piotlb invalidation is issued in the guest,
those caches should be refreshed.


Perhaps I have asked this before. :) To me, such emulated devices
should implement an ATS capability. You may mention the devices that
do not implement an ATS capability but cache the translation result,
and note that it is better to implement the ATS cap if there is a need to
cache the translation request.



Signed-off-by: Yi Sun 
Signed-off-by: Zhenzhong Duan 
Reviewed-by: Clément Mathieu--Drif
---
  hw/i386/intel_iommu.c | 35 ++-
  1 file changed, 34 insertions(+), 1 deletion(-)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index fa00f85fd7..317e630e08 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -2907,7 +2907,7 @@ static void vtd_piotlb_pasid_invalidate(IntelIOMMUState *s,
  continue;
  }
  
-if (!s->scalable_modern) {
+if (!s->scalable_modern || !vtd_as_has_map_notifier(vtd_as)) {
  vtd_address_space_sync(vtd_as);
  }
  }
@@ -2919,6 +2919,9 @@ static void vtd_piotlb_page_invalidate(IntelIOMMUState *s, uint16_t domain_id,
 bool ih)
  {
  VTDIOTLBPageInvInfo info;
+VTDAddressSpace *vtd_as;
+VTDContextEntry ce;
+hwaddr size = (1 << am) * VTD_PAGE_SIZE;
  
  info.domain_id = domain_id;

  info.pasid = pasid;
@@ -2929,6 +2932,36 @@ static void vtd_piotlb_page_invalidate(IntelIOMMUState *s, uint16_t domain_id,
  g_hash_table_foreach_remove(s->iotlb,
  vtd_hash_remove_by_page_piotlb, &info);
  vtd_iommu_unlock(s);
+
+QLIST_FOREACH(vtd_as, &s->vtd_as_with_notifiers, next) {
+if (!vtd_dev_to_context_entry(s, pci_bus_num(vtd_as->bus),
+  vtd_as->devfn, &ce) &&
+domain_id == vtd_get_domain_id(s, &ce, vtd_as->pasid)) {
+uint32_t rid2pasid = VTD_CE_GET_RID2PASID(&ce);
+IOMMUTLBEvent event;
+
+if ((vtd_as->pasid != PCI_NO_PASID || pasid != rid2pasid) &&
+vtd_as->pasid != pasid) {
+continue;
+}
+
+/*
+ * Page-Selective-within-PASID PASID-based-IOTLB Invalidation
+ * does not flush stage-2 entries. See spec section 6.5.2.4
+ */
+if (!s->scalable_modern) {
+continue;
+}
+
+event.type = IOMMU_NOTIFIER_UNMAP;
+event.entry.target_as = &address_space_memory;
+event.entry.iova = addr;
+event.entry.perm = IOMMU_NONE;
+event.entry.addr_mask = size - 1;
+event.entry.translated_addr = 0;
+memory_region_notify_iommu(&vtd_as->iommu, 0, event);
+}
+}
  }
  
  static bool vtd_process_piotlb_desc(IntelIOMMUState *s,


--
Regards,
Yi Liu
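
As a worked example of the size and mask arithmetic in the hunk above
(size = (1 << am) * VTD_PAGE_SIZE, addr_mask = size - 1), here is a
standalone sketch. It assumes 4 KiB pages (a shift of 12) and uses the
hypothetical names demo_inv_size and demo_inv_addr_mask; it does the
shift in 64 bits, which is a choice made for this sketch rather than a
statement about the patch.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_PAGE_SHIFT 12  /* assumed 4 KiB pages */

/* Bytes covered by an invalidation with address mask am: 2^am pages. */
static uint64_t demo_inv_size(unsigned am)
{
    assert(am < 64 - DEMO_PAGE_SHIFT); /* keep the shift in range */
    return UINT64_C(1) << (am + DEMO_PAGE_SHIFT);
}

/* Mask of the low address bits covered by that invalidation range. */
static uint64_t demo_inv_addr_mask(unsigned am)
{
    return demo_inv_size(am) - 1;
}
```

With these assumptions, am = 0 covers a single 4 KiB page, am = 9 covers
2 MiB (mask 0x1fffff), and am = 18 covers 1 GiB.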



Re: [PATCH 4/4] hw/arm/sbsa-ref: Use two-stage SMMU

2024-08-19 Thread Marcin Juszkiewicz

On 16.08.2024 at 18:13, Peter Maydell wrote:

Now that our SMMU model supports enabling both stages of translation
at once, we can enable this in the sbsa-ref board.  Existing guest
code that only programs stage 1 and doesn't care about stage 2 should
continue to run with the same behaviour, but guests that do want to
do nested SMMU configurations can now do so.

Signed-off-by: Peter Maydell 


Yay! Another step in getting (S)BSA ACS pass done:

Operating System View:
 304 : Check SMMU S-EL2 & stage1 support  : Result:  PASS
Hypervisor View:
 352 : Check SMMU S-EL2 & stage2 support  : Result:  PASS

Reviewed-by: Marcin Juszkiewicz 


---
  hw/arm/sbsa-ref.c | 1 +
  1 file changed, 1 insertion(+)

diff --git a/hw/arm/sbsa-ref.c b/hw/arm/sbsa-ref.c
index ae37a923015..396abe9c1bd 100644
--- a/hw/arm/sbsa-ref.c
+++ b/hw/arm/sbsa-ref.c
@@ -621,6 +621,7 @@ static void create_smmu(const SBSAMachineState *sms, PCIBus *bus)
  
  dev = qdev_new(TYPE_ARM_SMMUV3);
  
+object_property_set_str(OBJECT(dev), "stage", "nested", &error_abort);
  object_property_set_link(OBJECT(dev), "primary-bus", OBJECT(bus),
   &error_abort);
  sysbus_realize_and_unref(SYS_BUS_DEVICE(dev), &error_fatal);





RE: [PATCH v2 16/17] intel_iommu: Introduce a property to control FS1GP cap bit setting

2024-08-19 Thread Duan, Zhenzhong


>-Original Message-
>From: Liu, Yi L 
>Subject: Re: [PATCH v2 16/17] intel_iommu: Introduce a property to control
>FS1GP cap bit setting
>
>On 2024/8/15 11:46, Duan, Zhenzhong wrote:
>>
>>
>>> -Original Message-
>>> From: Liu, Yi L 
>>> Subject: Re: [PATCH v2 16/17] intel_iommu: Introduce a property to control
>>> FS1GP cap bit setting
>>>
>>> On 2024/8/5 14:27, Zhenzhong Duan wrote:
>>>> When host IOMMU doesn't support FS1GP but vIOMMU does, host IOMMU
>>>> can't translate stage-1 page table from guest correctly.
>>>
>>> this series is for emulated devices, so the above statement does not
>>> belong to this series. Is there any other reason to have this option?
>>
>> Good catch, will remove this comment.
>> In fact, this patch is mainly for passthrough device where host IOMMU
>> doesn't support fs1gp.
>
>I see. To me, as long as the vIOMMU page walk logic supports 1GP large
>pages, it's ok to report the FS1GP cap to VM. But it is still fine to
>have this property to opt out of FS1GP if the admin/orchestration layer (e.g. libvirt)
>knows no hw iommu has this capability, so it is better to opt out of it
>before invoking QEMU.
>
>Is this your motivation for this property?

Exactly.

>
>>>
>>>> Add a property x-cap-fs1gp for user to turn FS1GP off so that
>>>> nested page table on host side works.
>>>
>>> I guess you would need to sync the FS1GP cap with the host before reporting it
>>> in the vIOMMU when it comes to supporting passthrough devices.
>>
>> Yes, we already have this check, see
>> https://github.com/yiliu1765/qemu/commit/b7ac7ce3a2e21eb1b3172743ee6f73e80fe67b3a
>
>good to know it. :) Will you fail the VM if the device's iommu does not
>support FS1GP or just mask out the FS1GP?

For a cold-plugged VFIO device, VM startup fails with the error report
"Stage-1 1GB huge page is unsupported by host IOMMU".
For a hot-plugged VFIO device, only the hotplug fails, with the same
"Stage-1 1GB huge page is unsupported by host IOMMU" error.

We don't update vIOMMU cap/ecap from host cap/ecap per Michael's suggestion, 
only vIOMMU properties can control them.

Thanks
Zhenzhong


RE: [PATCH v2 13/17] intel_iommu: piotlb invalidation should notify unmap

2024-08-19 Thread Duan, Zhenzhong


>-Original Message-
>From: Liu, Yi L 
>Subject: Re: [PATCH v2 13/17] intel_iommu: piotlb invalidation should
>notify unmap
>
>On 2024/8/5 14:27, Zhenzhong Duan wrote:
>> This is used by some emulated devices which cache address
>> translation results. When a piotlb invalidation is issued in the guest,
>> those caches should be refreshed.
>
>Perhaps I have asked this before. :) To me, such emulated devices
>should implement an ATS capability. You may mention the devices that
>do not implement an ATS capability but cache the translation result,
>and note that it is better to implement the ATS cap if there is a need to
>cache the translation request.

OK, will do. It will be something like:

"For a device that does not implement the ATS capability, or disables it,
but still caches the translation result, it is better to implement the ATS cap
or enable it if there is a need to cache the translation request."

Thanks
Zhenzhong

>
>>
>> Signed-off-by: Yi Sun 
>> Signed-off-by: Zhenzhong Duan 
>> Reviewed-by: Clément Mathieu--Drif
>> ---
>>   hw/i386/intel_iommu.c | 35 ++-
>>   1 file changed, 34 insertions(+), 1 deletion(-)
>>
>> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
>> index fa00f85fd7..317e630e08 100644
>> --- a/hw/i386/intel_iommu.c
>> +++ b/hw/i386/intel_iommu.c
>> @@ -2907,7 +2907,7 @@ static void vtd_piotlb_pasid_invalidate(IntelIOMMUState *s,
>>   continue;
>>   }
>>
>> -if (!s->scalable_modern) {
>> +if (!s->scalable_modern || !vtd_as_has_map_notifier(vtd_as)) {
>>   vtd_address_space_sync(vtd_as);
>>   }
>>   }
>> @@ -2919,6 +2919,9 @@ static void vtd_piotlb_page_invalidate(IntelIOMMUState *s, uint16_t domain_id,
>>  bool ih)
>>   {
>>   VTDIOTLBPageInvInfo info;
>> +VTDAddressSpace *vtd_as;
>> +VTDContextEntry ce;
>> +hwaddr size = (1 << am) * VTD_PAGE_SIZE;
>>
>>   info.domain_id = domain_id;
>>   info.pasid = pasid;
>> @@ -2929,6 +2932,36 @@ static void vtd_piotlb_page_invalidate(IntelIOMMUState *s, uint16_t domain_id,
>>   g_hash_table_foreach_remove(s->iotlb,
>>   vtd_hash_remove_by_page_piotlb, &info);
>>   vtd_iommu_unlock(s);
>> +
>> +QLIST_FOREACH(vtd_as, &s->vtd_as_with_notifiers, next) {
>> +if (!vtd_dev_to_context_entry(s, pci_bus_num(vtd_as->bus),
>> +  vtd_as->devfn, &ce) &&
>> +domain_id == vtd_get_domain_id(s, &ce, vtd_as->pasid)) {
>> +uint32_t rid2pasid = VTD_CE_GET_RID2PASID(&ce);
>> +IOMMUTLBEvent event;
>> +
>> +if ((vtd_as->pasid != PCI_NO_PASID || pasid != rid2pasid) &&
>> +vtd_as->pasid != pasid) {
>> +continue;
>> +}
>> +
>> +/*
>> + * Page-Selective-within-PASID PASID-based-IOTLB Invalidation
>> + * does not flush stage-2 entries. See spec section 6.5.2.4
>> + */
>> +if (!s->scalable_modern) {
>> +continue;
>> +}
>> +
>> +event.type = IOMMU_NOTIFIER_UNMAP;
>> +event.entry.target_as = &address_space_memory;
>> +event.entry.iova = addr;
>> +event.entry.perm = IOMMU_NONE;
>> +event.entry.addr_mask = size - 1;
>> +event.entry.translated_addr = 0;
>> +memory_region_notify_iommu(&vtd_as->iommu, 0, event);
>> +}
>> +}
>>   }
>>
>>   static bool vtd_process_piotlb_desc(IntelIOMMUState *s,
>
>--
>Regards,
>Yi Liu


Re: [PATCH 0/4] hw/arm: Enable 'nested' SMMU in virt, sbsa-ref

2024-08-19 Thread Eric Auger
Hi Peter,

On 8/16/24 18:13, Peter Maydell wrote:
> This patchset enables support for nested (two stage) translations
> in the SMMU in the virt and sbsa-ref boards.
>
> Patch 1 is Cornelia's compat-machine machinery patch, which we
> need to make this change only happen for virt-9.2 and later;
> patch 2 is a trivial "missing comment update" change; patches
> 3 and 4 are the board changes.
>
> Enabling nested support should be transparent to guests, which
> will only enable stage 2 if they actually want it.
>
> thanks
> -- PMM

For the whole series:

Reviewed-by: Eric Auger 

Thanks

Eric

>
> Cornelia Huck (1):
>   hw: add compat machines for 9.2
>
> Peter Maydell (3):
>   hw/arm/smmuv3: Update comment documenting "stage" property
>   hw/arm/virt: Default to two-stage SMMU from virt-9.2
>   hw/arm/sbsa-ref: Use two-stage SMMU
>
>  include/hw/arm/virt.h  |  1 +
>  include/hw/boards.h|  3 +++
>  include/hw/i386/pc.h   |  3 +++
>  hw/arm/sbsa-ref.c  |  1 +
>  hw/arm/smmuv3.c|  1 +
>  hw/arm/virt.c  | 19 +--
>  hw/core/machine.c  |  3 +++
>  hw/i386/pc.c   |  3 +++
>  hw/i386/pc_piix.c  | 15 ---
>  hw/i386/pc_q35.c   | 13 +++--
>  hw/m68k/virt.c | 11 +--
>  hw/ppc/spapr.c | 17 ++---
>  hw/s390x/s390-virtio-ccw.c | 14 +-
>  13 files changed, 91 insertions(+), 13 deletions(-)
>




Re: [PATCH v4 4/6] machine/nitro-enclave: Add built-in Nitro Secure Module device

2024-08-19 Thread Alexander Graf

Hey Dorjoy,

On 18.08.24 13:42, Dorjoy Chowdhury wrote:

AWS Nitro Enclaves have built-in Nitro Secure Module (NSM) device which
is used for stripped down TPM functionality like attestation. This commit
adds the built-in NSM device in the nitro-enclave machine type.

In Nitro Enclaves, all the PCRs start in a known zero state and the first
16 PCRs are locked from boot and reserved. The PCR0, PCR1, PCR2 and PCR8
contain the SHA384 hashes related to the EIF file used to boot the
VM for validation.

Some optional nitro-enclave machine options have been added:
 - 'id': Enclave identifier, reflected in the module-id of the NSM
device. If not provided, a default id will be set.
 - 'parent-role': Parent instance IAM role ARN, reflected in PCR3
of the NSM device.
 - 'parent-id': Parent instance identifier, reflected in PCR4 of the
NSM device.

Signed-off-by: Dorjoy Chowdhury 
---
  crypto/meson.build  |   2 +-
  crypto/x509-utils.c |  73 +++



Can you please put this new API into its own patch file?



  hw/core/eif.c   | 225 +---
  hw/core/eif.h   |   5 +-



These changes to eif.c should ideally already be part of the patch that 
introduces eif.c (patch 1), no? In fact, do you think you can make the 
whole eif logic its own patch file?




  hw/core/meson.build |   4 +-
  hw/i386/Kconfig |   1 +
  hw/i386/nitro_enclave.c | 141 +++-
  include/crypto/x509-utils.h |  22 
  include/hw/i386/nitro_enclave.h |  26 
  9 files changed, 479 insertions(+), 20 deletions(-)
  create mode 100644 crypto/x509-utils.c
  create mode 100644 include/crypto/x509-utils.h

diff --git a/crypto/meson.build b/crypto/meson.build
index c46f9c22a7..09633194ed 100644
--- a/crypto/meson.build
+++ b/crypto/meson.build
@@ -62,7 +62,7 @@ endif
  if gcrypt.found()
util_ss.add(gcrypt, files('random-gcrypt.c'))
  elif gnutls.found()
-  util_ss.add(gnutls, files('random-gnutls.c'))
+  util_ss.add(gnutls, files('random-gnutls.c', 'x509-utils.c'))



What if we don't have gnutls? Will everything still compile, or do we
need to add any dependencies?




  elif get_option('rng_none')
util_ss.add(files('random-none.c'))
  else
diff --git a/crypto/x509-utils.c b/crypto/x509-utils.c
new file mode 100644
index 00..2422eb995c
--- /dev/null
+++ b/crypto/x509-utils.c
@@ -0,0 +1,73 @@
+/*
+ * X.509 certificate related helpers
+ *
+ * Copyright (c) 2024 Dorjoy Chowdhury 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * (at your option) any later version.  See the COPYING file in the
+ * top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "qapi/error.h"
+#include "crypto/x509-utils.h"
+#include 
+#include 
+#include 
+
+static int qcrypto_to_gnutls_hash_alg_map[QCRYPTO_HASH_ALG__MAX] = {
+[QCRYPTO_HASH_ALG_MD5] = GNUTLS_DIG_MD5,
+[QCRYPTO_HASH_ALG_SHA1] = GNUTLS_DIG_SHA1,
+[QCRYPTO_HASH_ALG_SHA224] = GNUTLS_DIG_SHA224,
+[QCRYPTO_HASH_ALG_SHA256] = GNUTLS_DIG_SHA256,
+[QCRYPTO_HASH_ALG_SHA384] = GNUTLS_DIG_SHA384,
+[QCRYPTO_HASH_ALG_SHA512] = GNUTLS_DIG_SHA512,
+[QCRYPTO_HASH_ALG_RIPEMD160] = GNUTLS_DIG_RMD160,
+};
+
+int qcrypto_get_x509_cert_fingerprint(uint8_t *cert, size_t size,
+  QCryptoHashAlgorithm alg,
+  uint8_t **result,
+  size_t *resultlen,
+  Error **errp)
+{
+int ret;
+gnutls_x509_crt_t crt;
+gnutls_datum_t datum = {.data = cert, .size = size};
+
+if (alg >= G_N_ELEMENTS(qcrypto_to_gnutls_hash_alg_map)) {
+error_setg(errp, "Unknown hash algorithm");
+return -1;
+}
+
+gnutls_x509_crt_init(&crt);
+
+if (gnutls_x509_crt_import(crt, &datum, GNUTLS_X509_FMT_PEM) != 0) {
+error_setg(errp, "Failed to import certificate");
+goto cleanup;
+}
+
+ret = gnutls_hash_get_len(qcrypto_to_gnutls_hash_alg_map[alg]);
+if (*resultlen == 0) {
+*resultlen = ret;
+*result = g_new0(uint8_t, *resultlen);
+} else if (*resultlen < ret) {
+error_setg(errp,
+   "Result buffer size %zu is smaller than hash %d",
+   *resultlen, ret);
+goto cleanup;
+}
+
+if (gnutls_x509_crt_get_fingerprint(crt,
+qcrypto_to_gnutls_hash_alg_map[alg],
+*result, resultlen) != 0) {
+error_setg(errp, "Failed to get fingerprint from certificate");
+goto cleanup;
+}
+
+return 0;
+
+ cleanup:
+gnutls_x509_crt_deinit(crt);
+return -1;
+}
diff --git a/hw/core/eif.c b/hw/core/eif.c
index 5558879a96..8e15142d36 100644
--- a/hw/core/eif.c
+++ b/hw/core/eif.c
@@ -11,7 +11,10 @@
  #include "qemu/osdep.h"
  #include "qemu/bswap.h"
  #include "qapi/error.h"
+#include 

Re: [PATCH v4 5/6] crypto: Support SHA384 hash when using glib

2024-08-19 Thread Daniel P . Berrangé
On Sun, Aug 18, 2024 at 05:42:56PM +0600, Dorjoy Chowdhury wrote:
> QEMU requires minimum glib version 2.66.0 as per the root meson.build
> file and per glib documentation[1] G_CHECKSUM_SHA384 is available since
> 2.51.
> 
> [1] https://docs.gtk.org/glib/enum.ChecksumType.html
> 
> Signed-off-by: Dorjoy Chowdhury 
> ---
>  crypto/hash-glib.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Reviewed-by: Daniel P. Berrangé 

> diff --git a/crypto/hash-glib.c b/crypto/hash-glib.c
> index 82de9db705..18e64faa9c 100644
> --- a/crypto/hash-glib.c
> +++ b/crypto/hash-glib.c
> @@ -29,7 +29,7 @@ static int qcrypto_hash_alg_map[QCRYPTO_HASH_ALG__MAX] = {
>  [QCRYPTO_HASH_ALG_SHA1] = G_CHECKSUM_SHA1,
>  [QCRYPTO_HASH_ALG_SHA224] = -1,
>  [QCRYPTO_HASH_ALG_SHA256] = G_CHECKSUM_SHA256,
> -[QCRYPTO_HASH_ALG_SHA384] = -1,
> +[QCRYPTO_HASH_ALG_SHA384] = G_CHECKSUM_SHA384,
>  [QCRYPTO_HASH_ALG_SHA512] = G_CHECKSUM_SHA512,
>  [QCRYPTO_HASH_ALG_RIPEMD160] = -1,
>  };

With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH] .gitlab-ci.d/windows.yml: Disable the qtests in the MSYS2 job

2024-08-19 Thread Philippe Mathieu-Daudé

On 19/8/24 07:30, Thomas Huth wrote:

On 16/08/2024 19.18, Philippe Mathieu-Daudé wrote:

On 16/8/24 18:40, Thomas Huth wrote:

On 16/08/2024 18.34, Philippe Mathieu-Daudé wrote:

On 16/8/24 17:37, Thomas Huth wrote:

The qtests are broken since a while in the MSYS2 job in the gitlab-CI,
likely due to some changes in the MSYS2 environment. So far nobody has
neither a clue what's going wrong here, nor an idea how to fix this
(in fact most QEMU developers even don't have a Windows environment
available for properly analyzing this problem), so let's disable the
qtests here again to get at least the test coverage for the compilation
and unit tests back to the CI.

Signed-off-by: Thomas Huth 
---
  .gitlab-ci.d/windows.yml | 2 ++
  1 file changed, 2 insertions(+)

diff --git a/.gitlab-ci.d/windows.yml b/.gitlab-ci.d/windows.yml
index a83f23a786..9f3112f010 100644
--- a/.gitlab-ci.d/windows.yml
+++ b/.gitlab-ci.d/windows.yml
@@ -23,6 +23,8 @@ msys2-64bit:
  # for this job, because otherwise the build could not complete within
  # the project timeout.
  CONFIGURE_ARGS:  --target-list=sparc-softmmu --without-default-devices -Ddebug=false -Doptimization=0
+    # The qtests are broken in the msys2 job on gitlab, so disable them:
+    TEST_ARGS: --no-suite qtest


Then building system emulation is pointless, isn't it?


We're still running the unit tests and some others.


I tried to configure with '--disable-system' and the same tests
are run


... but you lose *compile-testing* of all of the system files, so what's 
your point? ... sorry, I don't get it?


I'm wondering why we are wasting resources and time on our longest job
if the produced binary doesn't run. Anyway, I'm not objecting to
your patch.



Re: [PATCH v4 4/6] machine/nitro-enclave: Add built-in Nitro Secure Module device

2024-08-19 Thread Daniel P . Berrangé
On Sun, Aug 18, 2024 at 05:42:55PM +0600, Dorjoy Chowdhury wrote:
> AWS Nitro Enclaves have built-in Nitro Secure Module (NSM) device which
> is used for stripped down TPM functionality like attestation. This commit
> adds the built-in NSM device in the nitro-enclave machine type.
> 
> In Nitro Enclaves, all the PCRs start in a known zero state and the first
> 16 PCRs are locked from boot and reserved. The PCR0, PCR1, PCR2 and PCR8
> contain the SHA384 hashes related to the EIF file used to boot the
> VM for validation.
> 
> Some optional nitro-enclave machine options have been added:
> - 'id': Enclave identifier, reflected in the module-id of the NSM
> device. If not provided, a default id will be set.
> - 'parent-role': Parent instance IAM role ARN, reflected in PCR3
> of the NSM device.
> - 'parent-id': Parent instance identifier, reflected in PCR4 of the
> NSM device.
> 
> Signed-off-by: Dorjoy Chowdhury 
> ---
>  crypto/meson.build  |   2 +-
>  crypto/x509-utils.c |  73 +++
>  include/crypto/x509-utils.h |  22 

Preferably add these 3 in a standalone commit, since it is good practice
to separate commits adding infra from commits adding usage of infra.

>  hw/core/eif.c   | 225 +---
>  hw/core/eif.h   |   5 +-
>  hw/core/meson.build |   4 +-
>  hw/i386/Kconfig |   1 +
>  hw/i386/nitro_enclave.c | 141 +++-
>  include/hw/i386/nitro_enclave.h |  26 
>  9 files changed, 479 insertions(+), 20 deletions(-)
>  create mode 100644 crypto/x509-utils.c
>  create mode 100644 include/crypto/x509-utils.h
> 
> diff --git a/crypto/meson.build b/crypto/meson.build
> index c46f9c22a7..09633194ed 100644
> --- a/crypto/meson.build
> +++ b/crypto/meson.build
> @@ -62,7 +62,7 @@ endif
>  if gcrypt.found()
>util_ss.add(gcrypt, files('random-gcrypt.c'))
>  elif gnutls.found()
> -  util_ss.add(gnutls, files('random-gnutls.c'))
> +  util_ss.add(gnutls, files('random-gnutls.c', 'x509-utils.c'))
>  elif get_option('rng_none')
>util_ss.add(files('random-none.c'))
>  else

This logic block is handling preferences for the RNG impl.

We want to be compiling x509-utils.c *anytime* gnutls is
found, regardless of what we prioritize for RNG backend.
Also it should be added to crypto_ss, not util_ss.

So put this as its own standalone block

  if gnutls.found()
crypto_ss.add(files('x509-utils.c'))
  endif

> diff --git a/crypto/x509-utils.c b/crypto/x509-utils.c
> new file mode 100644
> index 00..2422eb995c
> --- /dev/null
> +++ b/crypto/x509-utils.c
> @@ -0,0 +1,73 @@
> +/*
> + * X.509 certificate related helpers
> + *
> + * Copyright (c) 2024 Dorjoy Chowdhury 
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or
> + * (at your option) any later version.  See the COPYING file in the
> + * top-level directory.
> + */
> +
> +#include "qemu/osdep.h"
> +#include "qapi/error.h"
> +#include "crypto/x509-utils.h"
> +#include 
> +#include 
> +#include 
> +
> +static int qcrypto_to_gnutls_hash_alg_map[QCRYPTO_HASH_ALG__MAX] = {

Can make this 'const' too

> +[QCRYPTO_HASH_ALG_MD5] = GNUTLS_DIG_MD5,
> +[QCRYPTO_HASH_ALG_SHA1] = GNUTLS_DIG_SHA1,
> +[QCRYPTO_HASH_ALG_SHA224] = GNUTLS_DIG_SHA224,
> +[QCRYPTO_HASH_ALG_SHA256] = GNUTLS_DIG_SHA256,
> +[QCRYPTO_HASH_ALG_SHA384] = GNUTLS_DIG_SHA384,
> +[QCRYPTO_HASH_ALG_SHA512] = GNUTLS_DIG_SHA512,
> +[QCRYPTO_HASH_ALG_RIPEMD160] = GNUTLS_DIG_RMD160,
> +};
> +
> +int qcrypto_get_x509_cert_fingerprint(uint8_t *cert, size_t size,
> +  QCryptoHashAlgorithm alg,
> +  uint8_t **result,
> +  size_t *resultlen,
> +  Error **errp)
> +{
> +int ret;
> +gnutls_x509_crt_t crt;
> +gnutls_datum_t datum = {.data = cert, .size = size};

Assign '*result = NULL' &&  '*resultlen = 0' here at the start, so we
have clear semantics on failure.
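
To illustrate the pattern suggested above, here is a minimal sketch. It is
not the gnutls-based helper from the patch: demo_fingerprint() is a
hypothetical stand-in that "hashes" by copying up to four input bytes, and
it assumes the callee always allocates the result (the patch also accepts a
caller-provided buffer). Only the handling of the out-parameters matters.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for a fingerprint helper. The "hash" is just a
 * copy of the first few input bytes; the out-parameter handling is the
 * point of the example.
 */
static int demo_fingerprint(const uint8_t *in, size_t inlen,
                            uint8_t **result, size_t *resultlen)
{
    size_t len;
    uint8_t *buf;

    /* Reset the outputs first so every failure path leaves them defined. */
    *result = NULL;
    *resultlen = 0;

    if (in == NULL || inlen == 0) {
        return -1;          /* caller sees NULL / 0, nothing to free */
    }

    len = inlen < 4 ? inlen : 4;
    buf = malloc(len);
    if (buf == NULL) {
        return -1;
    }
    memcpy(buf, in, len);

    /* Publish the outputs only once everything has succeeded. */
    *result = buf;
    *resultlen = len;
    return 0;
}
```

With this shape a caller can free the result unconditionally after a
failure, because it is guaranteed to be NULL.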


> +
> +if (alg >= G_N_ELEMENTS(qcrypto_to_gnutls_hash_alg_map)) {
> +error_setg(errp, "Unknown hash algorithm");
> +return -1;
> +}
> +
> +gnutls_x509_crt_init(&crt);
> +
> +if (gnutls_x509_crt_import(crt, &datum, GNUTLS_X509_FMT_PEM) != 0) {
> +error_setg(errp, "Failed to import certificate");
> +goto cleanup;
> +}
> +
> +ret = gnutls_hash_get_len(qcrypto_to_gnutls_hash_alg_map[alg]);
> +if (*resultlen == 0) {
> +*resultlen = ret;
> +*result = g_new0(uint8_t, *resultlen);
> +} else if (*resultlen < ret) {
> +error_setg(errp,
> +   "Result buffer size %zu is smaller than hash %d",
> +   *resultlen, ret);
> +goto cleanup;
> +}
> +
> +if (gnutls_x509_crt_get_fingerprint(crt,
> +qcrypto_to_gnutls_hash_alg_map[alg],
> + 

Re: [PATCH] .gitlab-ci.d/windows.yml: Disable the qtests in the MSYS2 job

2024-08-19 Thread Thomas Huth

On 19/08/2024 12.21, Philippe Mathieu-Daudé wrote:

On 19/8/24 07:30, Thomas Huth wrote:

On 16/08/2024 19.18, Philippe Mathieu-Daudé wrote:

On 16/8/24 18:40, Thomas Huth wrote:

On 16/08/2024 18.34, Philippe Mathieu-Daudé wrote:

On 16/8/24 17:37, Thomas Huth wrote:

The qtests are broken since a while in the MSYS2 job in the gitlab-CI,
likely due to some changes in the MSYS2 environment. So far nobody has
neither a clue what's going wrong here, nor an idea how to fix this
(in fact most QEMU developers even don't have a Windows environment
available for properly analyzing this problem), so let's disable the
qtests here again to get at least the test coverage for the compilation
and unit tests back to the CI.

Signed-off-by: Thomas Huth 
---
  .gitlab-ci.d/windows.yml | 2 ++
  1 file changed, 2 insertions(+)

diff --git a/.gitlab-ci.d/windows.yml b/.gitlab-ci.d/windows.yml
index a83f23a786..9f3112f010 100644
--- a/.gitlab-ci.d/windows.yml
+++ b/.gitlab-ci.d/windows.yml
@@ -23,6 +23,8 @@ msys2-64bit:
  # for this job, because otherwise the build could not complete within
  # the project timeout.
  CONFIGURE_ARGS:  --target-list=sparc-softmmu --without-default-devices -Ddebug=false -Doptimization=0

+    # The qtests are broken in the msys2 job on gitlab, so disable them:
+    TEST_ARGS: --no-suite qtest


Then building system emulation is pointless, isn't it?


We're still running the unit tests and some others.


I tried to configure with '--disable-system' and the same tests
are run


... but you lose *compile-testing* of all of the system files, so what's 
your point? ... sorry, I don't get it?


I'm wondering why we are wasting resources and time on our longest job
if the produced binary doesn't run. Anyway, I'm not objecting to
your patch.


Ah, ok, I missed that idea about one of the longest running jobs. Hmmm, 
considering that we compile test the code in the cross-win64-system test, 
too, we could indeed shorten the runtime here a little bit ... I'll give it 
a try and send a v2...


 Thomas




Re: [PATCH v4 3/6] device/virtio-nsm: Support for Nitro Secure Module device

2024-08-19 Thread Daniel P . Berrangé
On Sun, Aug 18, 2024 at 05:42:54PM +0600, Dorjoy Chowdhury wrote:
> Nitro Secure Module (NSM)[1] device is used in AWS Nitro Enclaves for
> stripped down TPM functionality like cryptographic attestation. The
> requests to and responses from NSM device are CBOR[2] encoded.
> 
> This commit adds support for NSM device in QEMU. Although related to
> AWS Nitro Enclaves, the virtio-nsm device is independent and can be
> used in other machine types as well. The libcbor[3] library has been
> used for the CBOR encoding and decoding functionalities.
> 
> [1] https://lists.oasis-open.org/archives/virtio-comment/202310/msg00387.html
> [2] http://cbor.io/
> [3] https://libcbor.readthedocs.io/en/latest/
> 
> Signed-off-by: Dorjoy Chowdhury 
> ---
>  MAINTAINERS  |   10 +
>  hw/virtio/Kconfig|5 +
>  hw/virtio/cbor-helpers.c |  292 ++
>  hw/virtio/meson.build|4 +
>  hw/virtio/virtio-nsm-pci.c   |   73 ++
>  hw/virtio/virtio-nsm.c   | 1648 ++
>  include/hw/virtio/cbor-helpers.h |   43 +
>  include/hw/virtio/virtio-nsm.h   |   59 ++
>  8 files changed, 2134 insertions(+)
>  create mode 100644 hw/virtio/cbor-helpers.c
>  create mode 100644 hw/virtio/virtio-nsm-pci.c
>  create mode 100644 hw/virtio/virtio-nsm.c
>  create mode 100644 include/hw/virtio/cbor-helpers.h
>  create mode 100644 include/hw/virtio/virtio-nsm.h
> 

> diff --git a/hw/virtio/meson.build b/hw/virtio/meson.build
> index 621fc65454..7ccb61cf74 100644
> --- a/hw/virtio/meson.build
> +++ b/hw/virtio/meson.build
> @@ -48,12 +48,15 @@ else
>system_virtio_ss.add(files('vhost-stub.c'))
>  endif
>  
> +libcbor = dependency('libcbor', version: '>=0.7.0')

In the following patch I suggest that we should have this at the top level
meson.build. Now that I see you're referencing this twice in different
places, we definitely want it in the top meson.build.

Also, the CI changes I mention in the following patch should be in whatever
patch first introduces the libcbor dependency.



> diff --git a/hw/virtio/virtio-nsm.c b/hw/virtio/virtio-nsm.c
> new file mode 100644
> index 00..e91848a2b0
> --- /dev/null
> +++ b/hw/virtio/virtio-nsm.c

> +static bool add_protected_header_to_cose(cbor_item_t *cose)
> +{
> +cbor_item_t *map = NULL;
> +cbor_item_t *key = NULL;
> +cbor_item_t *value = NULL;
> +cbor_item_t *bs = NULL;
> +size_t len;
> +bool r = false;
> +size_t buf_len = 4096;
> +uint8_t *buf = g_malloc(buf_len);

g_autofree avoids the manual  g_free call.
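
As a rough illustration of why this helps: g_autofree is built on the
compiler's "cleanup" variable attribute (GCC/Clang), so the buffer is
released on every return path. The sketch below mimics that mechanism
without GLib. The demo_* names and the call counter are hypothetical and
exist only to make the behaviour observable.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

static int demo_free_calls;     /* counts cleanups, for illustration only */

/* Cleanup handler: receives a pointer to the annotated variable. */
static void demo_free(void *p)
{
    void **pp = p;

    free(*pp);
    demo_free_calls++;
}

/* Hypothetical macro mimicking what g_autofree provides. */
#define demo_autofree __attribute__((cleanup(demo_free)))

static int demo_serialize(int fail_early)
{
    demo_autofree uint8_t *buf = malloc(4096);

    if (buf == NULL || fail_early) {
        return -1;              /* buf is released automatically here */
    }
    buf[0] = 0xa1;
    return 0;                   /* ... and here, with no explicit free */
}
```

In the function under review, this would let the g_free(buf) at the end of
the cleanup path go away.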

> +
> +map = cbor_new_definite_map(1);
> +if (!map) {
> +goto cleanup;
> +}
> +key = cbor_build_uint8(1);
> +if (!key) {
> +goto cleanup;
> +}
> +value = cbor_new_int8();
> +if (!value) {
> +goto cleanup;
> +}
> +cbor_mark_negint(value);
> +/* we don't actually sign the data, so we use -1 as the 'alg' value */
> +cbor_set_uint8(value, 0);
> +
> +if (!qemu_cbor_map_add(map, key, value)) {
> +goto cleanup;
> +}
> +
> +len = cbor_serialize(map, buf, buf_len);
> +if (len == 0) {
> +goto cleanup_map;
> +}
> +
> +bs = cbor_build_bytestring(buf, len);
> +if (!bs) {
> +goto cleanup_map;
> +}
> +if (!qemu_cbor_array_push(cose, bs)) {
> +cbor_decref(&bs);
> +goto cleanup_map;
> +}
> +r = true;
> +goto cleanup_map;
> +
> + cleanup:
> +if (key) {
> +cbor_decref(&key);
> +}
> +if (value) {
> +cbor_decref(&value);
> +}
> +
> + cleanup_map:
> +if (map) {
> +cbor_decref(&map);
> +}
> +g_free(buf);
> +return r;
> +}
> +



> +static bool handle_Attestation(VirtIONSM *vnsm, struct iovec *request,
> +   struct iovec *response, Error **errp)
> +{
> +cbor_item_t *root = NULL;
> +cbor_item_t *cose = NULL;
> +cbor_item_t *nested_map;
> +size_t len;
> +NSMAttestationReq nsm_req;
> +enum NSMResponseTypes type;
> +bool r = false;
> +size_t buf_len = 16384;
> +uint8_t *buf = g_malloc(buf_len);

Another suitable candidate for g_autofree

> +
> +type = get_nsm_attestation_req(request->iov_base, request->iov_len,
> +   &nsm_req);
> +if (type != NSM_SUCCESS) {
> +if (error_response(response, type, errp)) {
> +r = true;
> +}
> +goto out;
> +}
> +
> +cose = cbor_new_definite_array(4);
> +if (!cose) {
> +goto err;
> +}
> +if (!add_protected_header_to_cose(cose)) {
> +goto err;
> +}
> +if (!add_unprotected_header_to_cose(cose)) {
> +goto err;
> +}
> +
> +if (nsm_req.public_key_len > 0) {
> +memcpy(vnsm->public_key, nsm_req.public_key, nsm_req.public_key_len);
> +vnsm->public_key_len = nsm_req.public_key_len;
> +}
> +if (nsm_req.user_data_len > 0) {
> +memcpy(vnsm->user_data, nsm

Re: [PATCH v8 01/13] acpi/generic_event_device: add an APEI error device

2024-08-19 Thread Igor Mammedov
On Fri, 16 Aug 2024 09:37:33 +0200
Mauro Carvalho Chehab  wrote:

> Adds a generic error device to handle generic hardware error
> events as specified at ACPI 6.5 specification at 18.3.2.7.2:
> https://uefi.org/specs/ACPI/6.5/18_Platform_Error_Interfaces.html#event-notification-for-generic-error-sources
Make this match the comment in the code for consistency
(i.e. reference the 5.0b spec, including chapter number and name).

> using HID PNP0C33.
> 
> The PNP0C33 device is used to report hardware errors to
> the guest via ACPI APEI Generic Hardware Error Source (GHES).
> 
> Co-authored-by: Mauro Carvalho Chehab 
> Co-authored-by: Jonathan Cameron 
> Signed-off-by: Jonathan Cameron 
> Signed-off-by: Mauro Carvalho Chehab 
> Reviewed-by: Igor Mammedov 
> ---
>  hw/acpi/aml-build.c| 10 ++
>  hw/acpi/generic_event_device.c |  8 
>  include/hw/acpi/acpi_dev_interface.h   |  1 +
>  include/hw/acpi/aml-build.h|  2 ++
>  include/hw/acpi/generic_event_device.h |  1 +
>  5 files changed, 22 insertions(+)
> 
> diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
> index 6d4517cfbe3d..cb167523859f 100644
> --- a/hw/acpi/aml-build.c
> +++ b/hw/acpi/aml-build.c
> @@ -2520,3 +2520,13 @@ Aml *aml_i2c_serial_bus_device(uint16_t address, const char *resource_source)
>  
>  return var;
>  }
> +
> +/* ACPI 5.0: 18.3.2.6.2 Event Notification For Generic Error Sources */

should be ACPI 5.0b

> +Aml *aml_error_device(void)
> +{
> +Aml *dev = aml_device(ACPI_APEI_ERROR_DEVICE);
> +aml_append(dev, aml_name_decl("_HID", aml_string("PNP0C33")));
> +aml_append(dev, aml_name_decl("_UID", aml_int(0)));
> +
> +return dev;
> +}
> diff --git a/hw/acpi/generic_event_device.c b/hw/acpi/generic_event_device.c
> index 15b4c3ebbf24..b4c83a089a02 100644
> --- a/hw/acpi/generic_event_device.c
> +++ b/hw/acpi/generic_event_device.c
> @@ -26,6 +26,7 @@ static const uint32_t ged_supported_events[] = {
>  ACPI_GED_PWR_DOWN_EVT,
>  ACPI_GED_NVDIMM_HOTPLUG_EVT,
>  ACPI_GED_CPU_HOTPLUG_EVT,
> +ACPI_GED_ERROR_EVT,
>  };
>  
>  /*
> @@ -116,6 +117,11 @@ void build_ged_aml(Aml *table, const char *name, HotplugHandler *hotplug_dev,
> aml_notify(aml_name(ACPI_POWER_BUTTON_DEVICE),
>aml_int(0x80)));
>  break;
> +case ACPI_GED_ERROR_EVT:
> +aml_append(if_ctx,
> +   aml_notify(aml_name(ACPI_APEI_ERROR_DEVICE),
> +  aml_int(0x80)));
> +break;
>  case ACPI_GED_NVDIMM_HOTPLUG_EVT:
>  aml_append(if_ctx,
> aml_notify(aml_name("\\_SB.NVDR"),
> @@ -295,6 +301,8 @@ static void acpi_ged_send_event(AcpiDeviceIf *adev, AcpiEventStatusBits ev)
>  sel = ACPI_GED_MEM_HOTPLUG_EVT;
>  } else if (ev & ACPI_POWER_DOWN_STATUS) {
>  sel = ACPI_GED_PWR_DOWN_EVT;
> +} else if (ev & ACPI_GENERIC_ERROR) {
> +sel = ACPI_GED_ERROR_EVT;
>  } else if (ev & ACPI_NVDIMM_HOTPLUG_STATUS) {
>  sel = ACPI_GED_NVDIMM_HOTPLUG_EVT;
>  } else if (ev & ACPI_CPU_HOTPLUG_STATUS) {
> diff --git a/include/hw/acpi/acpi_dev_interface.h b/include/hw/acpi/acpi_dev_interface.h
> index 68d9d15f50aa..8294f8f0ccca 100644
> --- a/include/hw/acpi/acpi_dev_interface.h
> +++ b/include/hw/acpi/acpi_dev_interface.h
> @@ -13,6 +13,7 @@ typedef enum {
>  ACPI_NVDIMM_HOTPLUG_STATUS = 16,
>  ACPI_VMGENID_CHANGE_STATUS = 32,
>  ACPI_POWER_DOWN_STATUS = 64,
> +ACPI_GENERIC_ERROR = 128,
>  } AcpiEventStatusBits;
>  
>  #define TYPE_ACPI_DEVICE_IF "acpi-device-interface"
> diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
> index a3784155cb33..44d1a6af0c69 100644
> --- a/include/hw/acpi/aml-build.h
> +++ b/include/hw/acpi/aml-build.h
> @@ -252,6 +252,7 @@ struct CrsRangeSet {
>  /* Consumer/Producer */
>  #define AML_SERIAL_BUS_FLAG_CONSUME_ONLY(1 << 1)
>  
> +#define ACPI_APEI_ERROR_DEVICE   "GEDD"
>  /**
>   * init_aml_allocator:
>   *
> @@ -382,6 +383,7 @@ Aml *aml_dma(AmlDmaType typ, AmlDmaBusMaster bm, AmlTransferSize sz,
>   uint8_t channel);
>  Aml *aml_sleep(uint64_t msec);
>  Aml *aml_i2c_serial_bus_device(uint16_t address, const char *resource_source);
> +Aml *aml_error_device(void);
>  
>  /* Block AML object primitives */
>  Aml *aml_scope(const char *name_format, ...) G_GNUC_PRINTF(1, 2);
> diff --git a/include/hw/acpi/generic_event_device.h b/include/hw/acpi/generic_event_device.h
> index 40af3550b56d..9ace8fe70328 100644
> --- a/include/hw/acpi/generic_event_device.h
> +++ b/include/hw/acpi/generic_event_device.h
> @@ -98,6 +98,7 @@ OBJECT_DECLARE_SIMPLE_TYPE(AcpiGedState, ACPI_GED)
>  #define ACPI_GED_PWR_DOWN_EVT  0x2
>  #define ACPI_GED_NVDIMM_HOTPLUG_EVT 0x4
>  #define ACPI_GED_CPU_HOTPLUG_EVT0x8
> +#define ACPI_GED_ERROR_EVT  0x10
>  
>  typedef struct GEDSta

[PATCH] gitlab-ci: Build MSYS2 job using multiple CPUs

2024-08-19 Thread Philippe Mathieu-Daudé
Signed-off-by: Philippe Mathieu-Daudé 
---
I don't know how to use PowerShell to run nproc+1 jobs
to make up for jobs waiting on I/O.
---
 .gitlab-ci.d/windows.yml | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/.gitlab-ci.d/windows.yml b/.gitlab-ci.d/windows.yml
index a83f23a786..35ccb74fee 100644
--- a/.gitlab-ci.d/windows.yml
+++ b/.gitlab-ci.d/windows.yml
@@ -110,6 +110,7 @@ msys2-64bit:
   mingw-w64-x86_64-usbredir
   mingw-w64-x86_64-zstd"
   - Write-Output "Running build at $(Get-Date -Format u)"
+  - $env:JOBS = $(.\msys64\usr\bin\bash -lc nproc)
   - $env:CHERE_INVOKING = 'yes'  # Preserve the current working directory
   - $env:MSYS = 'winsymlinks:native' # Enable native Windows symlink
   - $env:CCACHE_BASEDIR = "$env:CI_PROJECT_DIR"
@@ -121,7 +122,7 @@ msys2-64bit:
   - cd build
   - ..\msys64\usr\bin\bash -lc "ccache --zero-stats"
   - ..\msys64\usr\bin\bash -lc "../configure --enable-fdt=system $CONFIGURE_ARGS"
-  - ..\msys64\usr\bin\bash -lc "make"
+  - ..\msys64\usr\bin\bash -lc "make -j$env:JOBS"
   - ..\msys64\usr\bin\bash -lc "make check MTESTARGS='$TEST_ARGS' || { cat meson-logs/testlog.txt; exit 1; } ;"
   - ..\msys64\usr\bin\bash -lc "ccache --show-stats"
   - Write-Output "Finished build at $(Get-Date -Format u)"
-- 
2.45.2




[PATCH v2 0/2] riscv: char: Avoid dropped charecters

2024-08-19 Thread Alistair Francis
This series fixes: https://gitlab.com/qemu-project/qemu/-/issues/2114

This converts the RISC-V character device callers of qemu_chr_fe_write()
to either use qemu_chr_fe_write_all() or to call qemu_chr_fe_write()
asynchronously and act on the return value.

v2:
 - Use Fifo8 for the Sifive UART instead of a custom FIFO

Alistair Francis (2):
  hw/char: riscv_htif: Use blocking qemu_chr_fe_write_all
  hw/char: sifive_uart: Print uart charecters async

 include/hw/char/sifive_uart.h | 17 ++-
 hw/char/riscv_htif.c  | 12 -
 hw/char/sifive_uart.c | 88 +--
 3 files changed, 109 insertions(+), 8 deletions(-)

-- 
2.46.0




[PATCH v2 2/2] hw/char: sifive_uart: Print uart charecters async

2024-08-19 Thread Alistair Francis
The current approach of using qemu_chr_fe_write() and ignoring the
return values results in dropped characters [1].

Let's update the SiFive UART to use an asynchronous sifive_uart_xmit()
function to transmit the characters and apply back pressure to the guest
with the SIFIVE_UART_TXFIFO_FULL status.

This should avoid dropped characters and more realistically model the
hardware.

1: https://gitlab.com/qemu-project/qemu/-/issues/2114
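
For readers unfamiliar with the idea, the back-pressure scheme can be
sketched roughly as follows. This is a self-contained toy model, not the
code in this patch: DemoUart, the demo_* functions, the flag value and the
byte "budget" that stands in for a slow backend are all assumptions made
for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_FIFO_SIZE   8
#define DEMO_TXFIFO_FULL (1u << 31)   /* hypothetical "FIFO full" flag */

typedef struct {
    uint8_t data[DEMO_FIFO_SIZE];
    size_t head;            /* index of the oldest queued byte */
    size_t used;            /* number of queued bytes */
    uint32_t txfifo;        /* guest-visible status register */
} DemoUart;

/* Hypothetical backend: accepts at most 'budget' bytes per call. */
static size_t demo_backend_write(const uint8_t *buf, size_t len,
                                 size_t budget)
{
    (void)buf;              /* a real backend would consume the bytes */
    return len < budget ? len : budget;
}

/* Guest writes one byte; a full FIFO rejects it instead of dropping. */
static bool demo_guest_write(DemoUart *u, uint8_t ch)
{
    if (u->used == DEMO_FIFO_SIZE) {
        return false;
    }
    u->data[(u->head + u->used) % DEMO_FIFO_SIZE] = ch;
    u->used++;
    if (u->used == DEMO_FIFO_SIZE) {
        u->txfifo |= DEMO_TXFIFO_FULL;
    }
    return true;
}

/* Drain what the backend accepts; pop only the bytes actually written. */
static void demo_xmit(DemoUart *u, size_t budget)
{
    while (u->used > 0 && budget > 0) {
        size_t contig = DEMO_FIFO_SIZE - u->head;
        size_t n = u->used < contig ? u->used : contig;
        size_t done = demo_backend_write(&u->data[u->head], n, budget);

        if (done == 0) {
            break;          /* backend is busy; retry on a later call */
        }
        u->head = (u->head + done) % DEMO_FIFO_SIZE;
        u->used -= done;
        budget -= done;
    }
    if (u->used < DEMO_FIFO_SIZE) {
        u->txfifo &= ~DEMO_TXFIFO_FULL;
    }
}
```

The guest polls the status register and stops writing while the full flag
is set, so nothing is lost when the backend is slow.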

Signed-off-by: Alistair Francis 
---
 include/hw/char/sifive_uart.h | 17 ++-
 hw/char/sifive_uart.c | 88 +--
 2 files changed, 99 insertions(+), 6 deletions(-)

diff --git a/include/hw/char/sifive_uart.h b/include/hw/char/sifive_uart.h
index 7f6c79f8bd..b43109bb8b 100644
--- a/include/hw/char/sifive_uart.h
+++ b/include/hw/char/sifive_uart.h
@@ -24,6 +24,7 @@
 #include "hw/qdev-properties.h"
 #include "hw/sysbus.h"
 #include "qom/object.h"
+#include "qemu/fifo8.h"
 
 enum {
 SIFIVE_UART_TXFIFO= 0,
@@ -48,9 +49,13 @@ enum {
 SIFIVE_UART_IP_RXWM   = 2  /* Receive watermark interrupt pending */
 };
 
+#define SIFIVE_UART_TXFIFO_FULL0x8000
+
 #define SIFIVE_UART_GET_TXCNT(txctrl)   ((txctrl >> 16) & 0x7)
 #define SIFIVE_UART_GET_RXCNT(rxctrl)   ((rxctrl >> 16) & 0x7)
+
 #define SIFIVE_UART_RX_FIFO_SIZE 8
+#define SIFIVE_UART_TX_FIFO_SIZE 8
 
 #define TYPE_SIFIVE_UART "riscv.sifive.uart"
 OBJECT_DECLARE_SIMPLE_TYPE(SiFiveUARTState, SIFIVE_UART)
@@ -63,13 +68,21 @@ struct SiFiveUARTState {
 qemu_irq irq;
 MemoryRegion mmio;
 CharBackend chr;
-uint8_t rx_fifo[SIFIVE_UART_RX_FIFO_SIZE];
-uint8_t rx_fifo_len;
+
+uint32_t txfifo;
 uint32_t ie;
 uint32_t ip;
 uint32_t txctrl;
 uint32_t rxctrl;
 uint32_t div;
+
+uint8_t rx_fifo[SIFIVE_UART_RX_FIFO_SIZE];
+uint8_t rx_fifo_len;
+
+Fifo8 tx_fifo;
+
+QEMUTimer *fifo_trigger_handle;
+uint64_t char_tx_time;
 };
 
 SiFiveUARTState *sifive_uart_create(MemoryRegion *address_space, hwaddr base,
diff --git a/hw/char/sifive_uart.c b/hw/char/sifive_uart.c
index 7fc6787f69..07730e241c 100644
--- a/hw/char/sifive_uart.c
+++ b/hw/char/sifive_uart.c
@@ -64,6 +64,72 @@ static void sifive_uart_update_irq(SiFiveUARTState *s)
 }
 }
 
+static gboolean sifive_uart_xmit(void *do_not_use, GIOCondition cond,
+ void *opaque)
+{
+SiFiveUARTState *s = opaque;
+int ret;
+const uint8_t *charecters;
+uint32_t numptr = 0;
+
+/* instant drain the fifo when there's no back-end */
+if (!qemu_chr_fe_backend_connected(&s->chr)) {
+fifo8_reset(&s->tx_fifo);
+return G_SOURCE_REMOVE;
+}
+
+if (fifo8_is_empty(&s->tx_fifo)) {
+return G_SOURCE_REMOVE;
+}
+
+/* Don't pop the FIFO incase the write fails */
+charecters = fifo8_peek_bufptr(&s->tx_fifo,
+   fifo8_num_used(&s->tx_fifo), &numptr);
+ret = qemu_chr_fe_write(&s->chr, charecters, numptr);
+
+if (ret >= 0) {
+/* We wrote the data, actuallly pop the fifo */
+fifo8_pop_bufptr(&s->tx_fifo, ret, NULL);
+}
+
+if (!fifo8_is_empty(&s->tx_fifo)) {
+guint r = qemu_chr_fe_add_watch(&s->chr, G_IO_OUT | G_IO_HUP,
+sifive_uart_xmit, s);
+if (!r) {
+fifo8_reset(&s->tx_fifo);
+return G_SOURCE_REMOVE;
+}
+}
+
+/* Clear the TX Full bit */
+if (!fifo8_is_full(&s->tx_fifo)) {
+s->txfifo &= ~SIFIVE_UART_TXFIFO_FULL;
+}
+
+sifive_uart_update_irq(s);
+return G_SOURCE_REMOVE;
+}
+
+static void sifive_uart_write_tx_fifo(SiFiveUARTState *s, const uint8_t *buf,
+  int size)
+{
+uint64_t current_time = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
+
+if (size > fifo8_num_free(&s->tx_fifo)) {
+size = fifo8_num_free(&s->tx_fifo);
+qemu_log_mask(LOG_GUEST_ERROR, "sifive_uart: TX FIFO overflow");
+}
+
+fifo8_push_all(&s->tx_fifo, buf, size);
+
+if (fifo8_is_full(&s->tx_fifo)) {
+s->txfifo |= SIFIVE_UART_TXFIFO_FULL;
+}
+
+timer_mod(s->fifo_trigger_handle, current_time +
+  (s->char_tx_time * 4));
+}
+
 static uint64_t
 sifive_uart_read(void *opaque, hwaddr addr, unsigned int size)
 {
@@ -82,7 +148,7 @@ sifive_uart_read(void *opaque, hwaddr addr, unsigned int size)
 return 0x8000;
 
 case SIFIVE_UART_TXFIFO:
-return 0; /* Should check tx fifo */
+return s->txfifo;
 case SIFIVE_UART_IE:
 return s->ie;
 case SIFIVE_UART_IP:
@@ -106,12 +172,10 @@ sifive_uart_write(void *opaque, hwaddr addr,
 {
 SiFiveUARTState *s = opaque;
 uint32_t value = val64;
-unsigned char ch = value;
 
 switch (addr) {
 case SIFIVE_UART_TXFIFO:
-qemu_chr_fe_write(&s->chr, &ch, 1);
-sifive_uart_update_irq(s);
+sifive_uart_write_tx_fifo(s, (uint8_t *) &value, 1);
 return;
 case SIFIVE_U

[PATCH v2 1/2] hw/char: riscv_htif: Use blocking qemu_chr_fe_write_all

2024-08-19 Thread Alistair Francis
The current approach of using qemu_chr_fe_write() and ignoring the
return values results in dropped characters [1]. Ideally we want to
report FIFO status to the guest, but the HTIF isn't a real UART, so we
don't really have a way to do that.

Instead let's just use qemu_chr_fe_write_all() so at least we don't drop
characters.

1: https://gitlab.com/qemu-project/qemu/-/issues/2114

Signed-off-by: Alistair Francis 
Reviewed-by: Daniel Henrique Barboza 
---
 hw/char/riscv_htif.c | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/hw/char/riscv_htif.c b/hw/char/riscv_htif.c
index 9bef60def1..d40b542948 100644
--- a/hw/char/riscv_htif.c
+++ b/hw/char/riscv_htif.c
@@ -218,7 +218,11 @@ static void htif_handle_tohost_write(HTIFState *s, 
uint64_t val_written)
 tswap64(syscall[3]) == HTIF_CONSOLE_CMD_PUTC) {
 uint8_t ch;
 cpu_physical_memory_read(tswap64(syscall[2]), &ch, 1);
-qemu_chr_fe_write(&s->chr, &ch, 1);
+/*
+ * XXX this blocks entire thread. Rewrite to use
+ * qemu_chr_fe_write and background I/O callbacks
+ */
+qemu_chr_fe_write_all(&s->chr, &ch, 1);
 resp = 0x100 | (uint8_t)payload;
 } else {
 qemu_log_mask(LOG_UNIMP,
@@ -237,7 +241,11 @@ static void htif_handle_tohost_write(HTIFState *s, 
uint64_t val_written)
 return;
 } else if (cmd == HTIF_CONSOLE_CMD_PUTC) {
 uint8_t ch = (uint8_t)payload;
-qemu_chr_fe_write(&s->chr, &ch, 1);
+/*
+ * XXX this blocks entire thread. Rewrite to use
+ * qemu_chr_fe_write and background I/O callbacks
+ */
+qemu_chr_fe_write_all(&s->chr, &ch, 1);
 resp = 0x100 | (uint8_t)payload;
 } else {
 qemu_log("HTIF device %d: unknown command\n", device);
-- 
2.46.0
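
For readers following the HTIF change above, the difference between a
single best-effort write and a "write all" loop can be sketched in plain C.
This is only an illustration under stated assumptions: toy_write_fn,
toy_write_all() and toy_backend_write() are hypothetical names, and the loop
is not QEMU's actual qemu_chr_fe_write_all() implementation. It only shows
why ignoring the return value of a partial writer can drop data, and why
retrying until done blocks the caller.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical non-blocking writer: bytes accepted, or -1 with errno set. */
typedef int (*toy_write_fn)(void *opaque, const uint8_t *buf, int len);

/* Retry until every byte is accepted or a hard error occurs. */
int toy_write_all(toy_write_fn wr, void *opaque, const uint8_t *buf, int len)
{
    int done = 0;

    while (done < len) {
        int ret = wr(opaque, buf + done, len - done);
        if (ret < 0) {
            if (errno == EAGAIN) {
                continue; /* backend busy: spin, which is the blocking part */
            }
            return -1;
        }
        done += ret;
    }
    return done;
}

/* Toy backend: refuses every other call and takes one byte at a time. */
struct toy_backend {
    uint8_t out[16];
    int used;
    int calls;
};

int toy_backend_write(void *opaque, const uint8_t *buf, int len)
{
    struct toy_backend *b = opaque;

    if (b->calls++ % 2 == 0) {
        errno = EAGAIN;   /* pretend the backend cannot take data right now */
        return -1;
    }
    if (len < 1 || b->used >= (int)sizeof(b->out)) {
        errno = ENOSPC;
        return -1;
    }
    memcpy(&b->out[b->used++], buf, 1);
    return 1;
}
```

Calling toy_backend_write() once and ignoring the result loses the data on a
busy backend, while toy_write_all() delivers every byte at the cost of
spinning in the caller, which is the trade-off the XXX comments in the patch
point at.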




Re: [PATCH v8 03/13] acpi/ghes: Add support for GED error device

2024-08-19 Thread Igor Mammedov
On Fri, 16 Aug 2024 09:37:35 +0200
Mauro Carvalho Chehab  wrote:

> From: Jonathan Cameron 
> 
> As a GED error device is now defined, add another type
> of notification.
> 
> Add error notification to GHES v2 using
>a GED error device GED triggered via interrupt.
 
This is hard to parse; perhaps reword it so that it is
clearer what does what.

> 
> [mchehab: do some cleanups at ACPI_HEST_SRC_ID_* checks and
>  rename HEST event to better identify GED interrupt OSPM]
> 
> Signed-off-by: Jonathan Cameron 
> Signed-off-by: Mauro Carvalho Chehab 
> Reviewed-by: Igor Mammedov 
> ---

In addition to the change log in the cover letter,
I'd suggest keeping a per-patch change log as well (after ---);
it helps reviewers notice the intended changes.


[...]
> +case ACPI_HEST_SRC_ID_GED:
> +build_ghes_hw_error_notification(table_data, ACPI_GHES_NOTIFY_GPIO);
While GPIO works for arm, it's not the case for other machines.
I recall a suggestion to use ACPI_GHES_NOTIFY_EXTERNAL instead of GPIO one,
but that got lost somewhere...

> +break;
>  default:
>  error_report("Not support this error source");
>  abort();
> @@ -370,6 +376,7 @@ void acpi_build_hest(GArray *table_data, BIOSLinker 
> *linker,
>  /* Error Source Count */
>  build_append_int_noprefix(table_data, ACPI_GHES_ERROR_SOURCE_COUNT, 4);
>  build_ghes_v2(table_data, ACPI_HEST_SRC_ID_SEA, linker);
> +build_ghes_v2(table_data, ACPI_HEST_SRC_ID_GED, linker);
>  
>  acpi_table_end(linker, &table);
>  }
> diff --git a/include/hw/acpi/ghes.h b/include/hw/acpi/ghes.h
> index fb80897e7eac..419a97d5cbd9 100644
> --- a/include/hw/acpi/ghes.h
> +++ b/include/hw/acpi/ghes.h
> @@ -59,9 +59,10 @@ enum AcpiGhesNotifyType {
>  ACPI_GHES_NOTIFY_RESERVED = 12
>  };
>  
> +/* Those are used as table indexes when building GHES tables */
>  enum {
>  ACPI_HEST_SRC_ID_SEA = 0,
> -/* future ids go here */
> +ACPI_HEST_SRC_ID_GED,
>  ACPI_HEST_SRC_ID_RESERVED,
>  };
>  




Re: [PULL 15/21] chardev: set record/replay on the base device of a muxed device

2024-08-19 Thread Peter Maydell
On Thu, 15 Aug 2024 at 15:53, Alex Bennée  wrote:
>
> From: Nicholas Piggin 
>
> chardev events to a muxed device don't get recorded because e.g.,
> qemu_chr_be_write() checks whether the base device has the record flag
> set.
>
> This can be seen when replaying a trace that has characters typed into
> the console, an examination of the log shows they are not recorded.
>
> Setting QEMU_CHAR_FEATURE_REPLAY on the base chardev fixes the problem.

Hi; Coverity points out a bug in this code (CID 1559470):

> -Chardev *qemu_chr_new_from_opts(QemuOpts *opts, GMainContext *context,
> -Error **errp)
> +static void qemu_chardev_set_replay(Chardev *chr, Error **errp)
> +{
> +if (replay_mode != REPLAY_MODE_NONE) {
> +if (CHARDEV_GET_CLASS(chr)->chr_ioctl) {
> +error_setg(errp, "Replay: ioctl is not supported "
> + "for serial devices yet");
> +return;
> +}
> +qemu_chr_set_feature(chr, QEMU_CHAR_FEATURE_REPLAY);
> +replay_register_char_driver(chr);
> +}
> +}

qemu_chardev_set_replay() assumes it is passed a non-NULL
'chr' pointer...

> @@ -693,14 +720,22 @@ Chardev *qemu_chr_new_noreplay(const char *label, const 
> char *filename,
>  Error *err = NULL;
>
>  if (strstart(filename, "chardev:", &p)) {
> -return qemu_chr_find(p);
> +chr = qemu_chr_find(p);

...but qemu_chr_find() returns NULL if it can't find the
chardev, and we don't catch that here...

> +if (replay) {
> +qemu_chardev_set_replay(chr, &err);

...so we will pass it to qemu_chardev_set_replay(), which
dumps core:

$ ./build/x86/qemu-system-arm -icount
shift=auto,rr=record,rrfile=replay.bin  -serial chardev:bang -M virt
Segmentation fault (core dumped)

(Compare the non-rr behaviour:
$ ./build/x86/qemu-system-arm  -serial chardev:bang -M virt
qemu-system-arm: -serial chardev:bang: could not connect serial device
to character backend 'chardev:bang'
)

Would you mind sending in a patch to fix this?

>  opts = qemu_chr_parse_compat(label, filename, permit_mux_mon);
>  if (!opts)
>  return NULL;
>
> -chr = qemu_chr_new_from_opts(opts, context, &err);
> +chr = __qemu_chr_new_from_opts(opts, context, replay, &err);
>  if (!chr) {
>  error_report_err(err);
>  goto out;

Side note: the "__" prefix is reserved for the system, so
we don't generally use it in QEMU function names. Could
you also submit a patch to rename the __qemu_chr_new()
and __qemu_chr_new_from_opts() functions, please?
(One common pattern for this kind of "function that does
the actual work behind foo()" is to call it "do_foo()".)

thanks
-- PMM
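
A minimal sketch of the kind of guard being asked for above, written as
self-contained C rather than against the real QEMU code. ToyChardev,
toy_find(), toy_set_replay() and do_toy_new() are all hypothetical stand-ins;
the only points illustrated are (a) checking the lookup result before handing
it to a helper that dereferences it, and (b) the "do_foo()" naming pattern in
place of a "__" prefix.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins; none of these are QEMU's real types. */
typedef struct ToyChardev {
    const char *label;
    bool replay;
} ToyChardev;

ToyChardev toy_devs[] = { { "serial0", false } };

/* Lookup that returns NULL when the label is unknown. */
ToyChardev *toy_find(const char *label)
{
    for (size_t i = 0; i < sizeof(toy_devs) / sizeof(toy_devs[0]); i++) {
        if (strcmp(toy_devs[i].label, label) == 0) {
            return &toy_devs[i];
        }
    }
    return NULL;
}

/* Assumes a non-NULL argument, so callers must check first. */
void toy_set_replay(ToyChardev *chr)
{
    chr->replay = true;
}

/* Worker behind a public wrapper: "do_" prefix rather than "__". */
ToyChardev *do_toy_new(const char *label, bool replay)
{
    ToyChardev *chr = toy_find(label);

    if (chr && replay) {      /* only touch the device if it was found */
        toy_set_replay(chr);
    }
    return chr;               /* NULL lets the caller report the error */
}
```

With the check in place, an unknown label yields NULL and the caller can
print its usual "could not connect" style error instead of crashing.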



Re: [PATCH] qemu-guest-agent: Update the logfile path of qga-fsfreeze-hook.log

2024-08-19 Thread Dehan Meng
'/var/log/qemu-ga' is more reasonable and forward-looking to facilitate
future log management. All qga-related logs would be better placed in a
dedicated and unified log directory.

On Wed, Aug 14, 2024 at 7:54 PM Konstantin Kostiuk 
wrote:

> This bug looks specific to the RedHat SELinux configuration.
> Is there any reason to move LOGFILE other than this?
>
> Best Regards,
> Konstantin Kostiuk.
>
>
> On Tue, Aug 13, 2024 at 6:11 AM Dehan Meng  wrote:
>
>> Since '/var/log/qga-fsfreeze-hook.log' is not included to proper
>> selinux context 'system_u:object_r:virt_qemu_ga_log_t:s0', it
>> should be changed to '/var/log/qemu-ga/qga-fsfreeze-hook.log'
>>
>> Jira: https://issues.redhat.com/browse/RHEL-52250
>> Signed-off-by: Dehan Meng 
>> ---
>>  scripts/qemu-guest-agent/fsfreeze-hook | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/scripts/qemu-guest-agent/fsfreeze-hook
>> b/scripts/qemu-guest-agent/fsfreeze-hook
>> index 13aafd4845..98aad5e18b 100755
>> --- a/scripts/qemu-guest-agent/fsfreeze-hook
>> +++ b/scripts/qemu-guest-agent/fsfreeze-hook
>> @@ -7,7 +7,7 @@
>>  # "freeze" argument before the filesystem is frozen. And for
>> fsfreeze-thaw
>>  # request, it is issued with "thaw" argument after filesystem is thawed.
>>
>> -LOGFILE=/var/log/qga-fsfreeze-hook.log
>> +LOGFILE=/var/log/qemu-ga/qga-fsfreeze-hook.log
>>  FSFREEZE_D=$(dirname -- "$0")/fsfreeze-hook.d
>>
>>  # Check whether file $1 is a backup or rpm-generated file and should be
>> ignored
>> --
>> 2.40.1
>>
>>


Re: [PATCH v8 04/13] qapi/acpi-hest: add an interface to do generic CPER error injection

2024-08-19 Thread Igor Mammedov
On Fri, 16 Aug 2024 09:37:36 +0200
Mauro Carvalho Chehab  wrote:

> Creates a QMP command to be used for generic ACPI APEI hardware error
> injection (HEST) via GHESv2.
> 
> The actual GHES code will be added at the followup patch.

modulo inconsistency in comments (see below), LGTM

> 
> Signed-off-by: Mauro Carvalho Chehab 
> Signed-off-by: Shiju Jose 
> Reviewed-by: Jonathan Cameron 
> ---
>  MAINTAINERS  |  7 +++
>  hw/acpi/Kconfig  |  5 +
>  hw/acpi/ghes_cper.c  | 33 +
>  hw/acpi/ghes_cper_stub.c | 19 +++
>  hw/acpi/meson.build  |  2 ++
>  hw/arm/Kconfig   |  5 +
>  include/hw/acpi/ghes.h   |  3 +++
>  qapi/acpi-hest.json  | 36 
>  qapi/meson.build |  1 +
>  qapi/qapi-schema.json|  1 +
>  10 files changed, 112 insertions(+)
>  create mode 100644 hw/acpi/ghes_cper.c
>  create mode 100644 hw/acpi/ghes_cper_stub.c
>  create mode 100644 qapi/acpi-hest.json
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 3584d6a6c6da..1d8091818899 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -2077,6 +2077,13 @@ F: hw/acpi/ghes.c
>  F: include/hw/acpi/ghes.h
>  F: docs/specs/acpi_hest_ghes.rst
>  
> +ACPI/HEST/GHES/ARM processor CPER
> +R: Mauro Carvalho Chehab 
> +S: Maintained
> +F: hw/arm/ghes_cper.c
> +F: hw/acpi/ghes_cper_stub.c
> +F: qapi/acpi-hest.json
> +
>  ppc4xx
>  L: qemu-...@nongnu.org
>  S: Orphan
> diff --git a/hw/acpi/Kconfig b/hw/acpi/Kconfig
> index e07d3204eb36..73ffbb82c150 100644
> --- a/hw/acpi/Kconfig
> +++ b/hw/acpi/Kconfig
> @@ -51,6 +51,11 @@ config ACPI_APEI
>  bool
>  depends on ACPI
>  
> +config GHES_CPER
> +bool
> +depends on ACPI_APEI
> +default y
> +
>  config ACPI_PCI
>  bool
>  depends on ACPI && PCI
> diff --git a/hw/acpi/ghes_cper.c b/hw/acpi/ghes_cper.c
> new file mode 100644
> index ..92ca84d738de
> --- /dev/null
> +++ b/hw/acpi/ghes_cper.c
> @@ -0,0 +1,33 @@
> +/*
> + * CPER payload parser for error injection
> + *
> + * Copyright(C) 2024 Huawei LTD.
> + *
> + * This code is licensed under the GPL version 2 or later. See the
> + * COPYING file in the top-level directory.
> + *
> + */
> +
> +#include "qemu/osdep.h"
> +
> +#include "qemu/base64.h"
> +#include "qemu/error-report.h"
> +#include "qemu/uuid.h"
> +#include "qapi/qapi-commands-acpi-hest.h"
> +#include "hw/acpi/ghes.h"
> +
> +void qmp_ghes_cper(const char *qmp_cper,
> +   Error **errp)
> +{
> +
> +uint8_t *cper;
> +size_t  len;
> +
> +cper = qbase64_decode(qmp_cper, -1, &len, errp);
> +if (!cper) {
> +error_setg(errp, "missing GHES CPER payload");
> +return;
> +}
> +
> +/* TODO: call a function at ghes */
> +}
> diff --git a/hw/acpi/ghes_cper_stub.c b/hw/acpi/ghes_cper_stub.c
> new file mode 100644
> index ..36138c462ac9
> --- /dev/null
> +++ b/hw/acpi/ghes_cper_stub.c
> @@ -0,0 +1,19 @@
> +/*
> + * Stub interface for CPER payload parser for error injection
> + *
> + * Copyright(C) 2024 Huawei LTD.
> + *
> + * This code is licensed under the GPL version 2 or later. See the
> + * COPYING file in the top-level directory.
> + *
> + */
> +
> +#include "qemu/osdep.h"
> +#include "qapi/error.h"
> +#include "qapi/qapi-commands-acpi-hest.h"
> +#include "hw/acpi/ghes.h"
> +
> +void qmp_ghes_cper(const char *cper, Error **errp)
> +{
> +error_setg(errp, "GHES QMP error inject is not compiled in");
> +}
> diff --git a/hw/acpi/meson.build b/hw/acpi/meson.build
> index fa5c07db9068..6cbf430eb66d 100644
> --- a/hw/acpi/meson.build
> +++ b/hw/acpi/meson.build
> @@ -34,4 +34,6 @@ endif
>  system_ss.add(when: 'CONFIG_ACPI', if_false: files('acpi-stub.c', 
> 'aml-build-stub.c', 'ghes-stub.c', 'acpi_interface.c'))
>  system_ss.add(when: 'CONFIG_ACPI_PCI_BRIDGE', if_false: 
> files('pci-bridge-stub.c'))
>  system_ss.add_all(when: 'CONFIG_ACPI', if_true: acpi_ss)
> +system_ss.add(when: 'CONFIG_GHES_CPER', if_true: files('ghes_cper.c'))
> +system_ss.add(when: 'CONFIG_GHES_CPER', if_false: files('ghes_cper_stub.c'))
>  system_ss.add(files('acpi-qmp-cmds.c'))
> diff --git a/hw/arm/Kconfig b/hw/arm/Kconfig
> index 1ad60da7aa2d..bed6ba27d715 100644
> --- a/hw/arm/Kconfig
> +++ b/hw/arm/Kconfig
> @@ -712,3 +712,8 @@ config ARMSSE
>  select UNIMP
>  select SSE_COUNTER
>  select SSE_TIMER
> +
> +config GHES_CPER
> +bool
> +depends on ARM
> +default y if AARCH64
> diff --git a/include/hw/acpi/ghes.h b/include/hw/acpi/ghes.h
> index 419a97d5cbd9..b977d65564ba 100644
> --- a/include/hw/acpi/ghes.h
> +++ b/include/hw/acpi/ghes.h
> @@ -23,6 +23,7 @@
>  #define ACPI_GHES_H
>  
>  #include "hw/acpi/bios-linker-loader.h"
> +#include "qapi/error.h"
>  #include "qemu/notify.h"
>  
>  extern NotifierList acpi_generic_error_notifiers;
> @@ -77,6 +78,8 @@ void acpi_build_hest(GArray *table_data, BIOSLinker *linker,
>  void acpi_ghes_add_fw_cfg(AcpiGhesState *vms, FWCfgState *s,
>

RE: [PATCH RFC V3 01/29] arm/virt,target/arm: Add new ARMCPU {socket,cluster,core,thread}-id property

2024-08-19 Thread Salil Mehta via
Hi Gavin,

Sorry, I was away for almost all of last week and only got back today.
Thanks for taking the time to review.

>  From: Gavin Shan 
>  Sent: Monday, August 12, 2024 5:36 AM
>  To: Salil Mehta ; qemu-devel@nongnu.org;
>  qemu-...@nongnu.org; m...@redhat.com
>  
>  On 6/14/24 9:36 AM, Salil Mehta wrote:
>  > This shall be used to store user specified
>  > topology{socket,cluster,core,thread}
>  > and shall be converted to a unique 'vcpu-id' which is used as
>  > slot-index during hot(un)plug of vCPU.
>  >
>  > Co-developed-by: Keqian Zhu 
>  > Signed-off-by: Keqian Zhu 
>  > Signed-off-by: Salil Mehta 
>  > ---
>  >   hw/arm/virt.c | 10 ++
>  >   include/hw/arm/virt.h | 28 
>  >   target/arm/cpu.c  |  4 
>  >   target/arm/cpu.h  |  4 
>  >   4 files changed, 46 insertions(+)
>  >
>  
>  Those 4 properties are introduced to determine the vCPU's slot, which is the
>  index to MachineState::possible_cpus::cpus[].

Correct.

>  From there, the CPU object
>  or instance is referenced and then the CPU's state can be further
>  determined. It sounds reasonable to use the CPU's topology to determine
>  the index. However, I'm wondering if this can be simplified to use
>  'cpu-index' or 'index'

Are you suggesting using the CPU index when specifying vCPUs on the
command line? I'm not even sure how it would simplify CPU naming.

The CPU index is an index internal to QOM. The closest thing you can
have is a 'slot-id', which could later be mapped to the CPU index
internally, but I'm not sure how useful it is to introduce this
slot abstraction. I did raise this in the original RFC I posted in 2020.


>  for a couple of facts: (1) 'cpu-index'
>  or 'index' is simplified. Users have to provide 4 parameters in order to
>  determine its index in the extreme case, for example "device_add
>  host-arm-cpu, id=cpu7,socket-id=1, cluster-id=1,core-id=1,thread-id=1". With
>  'cpu-index' or 'index', it can be simplified to 'index=7'. (2) The cold-booted
>  and hotpluggable CPUs are determined by their index instead of their
>  topology. For example, CPU0/1/2/3 are cold-booted CPUs while CPU4/5/6/7
>  are hotpluggable CPUs with command lines '-smp maxcpus=8,cpus=4'. So
>  'index' makes more sense to identify a vCPU's slot.


I'm not sure if anybody wants to use it this way. People want to specify
the topology, i.e. where the vCPU fits. Internally it's up to QOM to
translate that topology to some index.
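
To make the "topology in, index out" idea concrete, here is a minimal sketch
of one possible linearisation. It is an assumption made for illustration
only: struct toy_topology and toy_slot_index() are hypothetical names, and
the formula is a plain row-major flattening, not necessarily the mapping the
virt machine code uses.

```c
/* Hypothetical topology description; field names are illustrative only. */
struct toy_topology {
    unsigned sockets;
    unsigned clusters;  /* clusters per socket */
    unsigned cores;     /* cores per cluster */
    unsigned threads;   /* threads per core */
};

/* Row-major flattening of (socket, cluster, core, thread) into one index. */
unsigned toy_slot_index(const struct toy_topology *t,
                        unsigned socket_id, unsigned cluster_id,
                        unsigned core_id, unsigned thread_id)
{
    return ((socket_id * t->clusters + cluster_id) * t->cores + core_id)
           * t->threads + thread_id;
}
```

For a hypothetical 2 sockets x 1 cluster x 2 cores x 2 threads machine
(8 possible vCPUs), the tuple (1, 0, 1, 1) maps to slot 7 and (0, 0, 0, 0)
maps to slot 0, so the user-facing topology and an internal index can coexist
without exposing the index.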


>  
>  > diff --git a/hw/arm/virt.c b/hw/arm/virt.c index
>  > 3c93c0c0a6..11fc7fc318 100644
>  > --- a/hw/arm/virt.c
>  > +++ b/hw/arm/virt.c
>  > @@ -2215,6 +2215,14 @@ static void machvirt_init(MachineState
>  *machine)
>  > &error_fatal);
>  >
>  >   aarch64 &= object_property_get_bool(cpuobj, "aarch64",
>  > NULL);
>  > +object_property_set_int(cpuobj, "socket-id",
>  > +virt_get_socket_id(machine, n), NULL);
>  > +object_property_set_int(cpuobj, "cluster-id",
>  > +virt_get_cluster_id(machine, n), NULL);
>  > +object_property_set_int(cpuobj, "core-id",
>  > +virt_get_core_id(machine, n), NULL);
>  > +object_property_set_int(cpuobj, "thread-id",
>  > +virt_get_thread_id(machine, n),
>  > + NULL);
>  >
>  >   if (!vms->secure) {
>  >   object_property_set_bool(cpuobj, "has_el3", false,
>  > NULL); @@ -2708,6 +2716,7 @@ static const CPUArchIdList
>  *virt_possible_cpu_arch_ids(MachineState *ms)
>  >   {
>  >   int n;
>  >   unsigned int max_cpus = ms->smp.max_cpus;
>  > +unsigned int smp_threads = ms->smp.threads;
>  >   VirtMachineState *vms = VIRT_MACHINE(ms);
>  >   MachineClass *mc = MACHINE_GET_CLASS(vms);
>  >
>  > @@ -2721,6 +2730,7 @@ static const CPUArchIdList
>  *virt_possible_cpu_arch_ids(MachineState *ms)
>  >   ms->possible_cpus->len = max_cpus;
>  >   for (n = 0; n < ms->possible_cpus->len; n++) {
>  >   ms->possible_cpus->cpus[n].type = ms->cpu_type;
>  > +ms->possible_cpus->cpus[n].vcpus_count = smp_threads;
>  >   ms->possible_cpus->cpus[n].arch_id =
>  >   virt_cpu_mp_affinity(vms, n);
>  >
>  
>  Why @vcpus_count is initialized to @smp_threads? it needs to be
>  documented in the commit log.


Because every thread internally amounts to a vCPU in QOM, which has a
1:1 relationship with a KVM vCPU. AFAIK, QOM does not strictly follow
any architecture. Once you start to get into the details of threads, there
are many aspects of shared resources one has to consider, and
these can vary across different implementations of the architecture.

It is a bigger problem than you think, which I touched on at a very early
stage while doing the PoC of vCPU hotplug but have tried to avoid until now.


But I would like to hear other community members' views on this.

Hi Igor/Peter,

What is your take

[RFC-PATCH v2] vhost-user: add a request-reply lock

2024-08-19 Thread Prasad Pandit
From: Prasad Pandit 

QEMU threads use vhost_user_write/read calls to send
and receive request/reply messages from a vhost-user
device. When multiple threads communicate with the
same vhost-user device, they can receive each other's
messages, resulting in an erroneous state.

When fault_thread exits upon completion of Postcopy
migration, it sends a 'postcopy_end' message to the
vhost-user device. But sometimes the 'postcopy_end' message
is sent while the vhost device is being set up via
vhost_dev_start().

 Thread-1                           Thread-2

 vhost_dev_start                    postcopy_ram_incoming_cleanup
 vhost_device_iotlb_miss            postcopy_notify
 vhost_backend_update_device_iotlb  vhost_user_postcopy_notifier
 vhost_user_send_device_iotlb_msg   vhost_user_postcopy_end
 process_message_reply              process_message_reply
 vhost_user_read                    vhost_user_read
 vhost_user_read_header             vhost_user_read_header
 "Fail to update device iotlb"      "Failed to receive reply to postcopy_end"

This creates confusion when vhost-user device receives
'postcopy_end' message while it is trying to update
IOTLB entries.

 vhost_user_read_header:
  700871,700871: Failed to read msg header. Flags 0x0 instead of 0x5.
 vhost_device_iotlb_miss:
  700871,700871: Fail to update device iotlb
 vhost_user_postcopy_end:
  700871,700900: Failed to receive reply to postcopy_end
 vhost_user_read_header:
  700871,700871: Failed to read msg header. Flags 0x0 instead of 0x5.

Here the fault thread seems to end the postcopy migration
while another thread is starting the vhost-user device.

Add a mutex that is held for one request-reply cycle
to avoid this race condition.
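
The idea of holding a lock across one request-reply cycle can be sketched
outside QEMU with plain pthreads. Everything here is an assumption made for
illustration: ToyChannel, chan_send(), chan_recv() and toy_request_reply()
are hypothetical, the "wire" is a single shared variable rather than a
socket, and explicit lock/unlock calls stand in for the scoped
QEMU_LOCK_GUARD() used in the patch.

```c
#include <pthread.h>
#include <stdint.h>

/* Hypothetical channel: one shared slot standing in for the socket. */
typedef struct ToyChannel {
    pthread_mutex_t request_reply_lock;
    uint32_t wire;  /* last request written and not yet answered */
} ToyChannel;

void toy_channel_init(ToyChannel *ch)
{
    pthread_mutex_init(&ch->request_reply_lock, NULL);
    ch->wire = 0;
}

void chan_send(ToyChannel *ch, uint32_t request)
{
    ch->wire = request;
}

/* The toy device simply echoes the request id as its reply. */
uint32_t chan_recv(ToyChannel *ch)
{
    uint32_t reply = ch->wire;
    ch->wire = 0;
    return reply;
}

/* One full cycle under the lock: 0 if the reply matches the request. */
int toy_request_reply(ToyChannel *ch, uint32_t request)
{
    int ret;

    pthread_mutex_lock(&ch->request_reply_lock);
    chan_send(ch, request);
    ret = (chan_recv(ch) == request) ? 0 : -1;
    pthread_mutex_unlock(&ch->request_reply_lock);
    return ret;
}

/* Worker that hammers the channel and counts mismatched replies. */
struct toy_worker {
    ToyChannel *ch;
    uint32_t id;
    int failures;
};

void *toy_worker_fn(void *arg)
{
    struct toy_worker *w = arg;

    for (int i = 0; i < 100000; i++) {
        w->failures += (toy_request_reply(w->ch, w->id) != 0);
    }
    return NULL;
}
```

Because the send and the matching receive sit inside the same critical
section, a second thread cannot slip its own request in between and consume
the wrong reply, which is the failure mode shown in the log excerpt above.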

Fixes: 46343570c06e ("vhost+postcopy: Wire up POSTCOPY_END notify")
Suggested-by: Peter Xu 
Signed-off-by: Prasad Pandit 
---
 hw/virtio/vhost-user.c | 74 ++
 include/hw/virtio/vhost-user.h |  3 ++
 2 files changed, 77 insertions(+)

v2:
 - Place QEMU_LOCK_GUARD near the vhost_user_write() calls, holding
   the lock for longer fails some tests during rpmbuild(8).
 - rpmbuild(8) fails for some SRPMs, not all. RHEL-9 SRPM builds with
   this patch, whereas Fedora SRPM does not build.
 - The host OS also seems to affect rpmbuild(8). Some SRPMs build well
   on RHEL-9, but not on Fedora-40 machine.

v1: 
https://lore.kernel.org/qemu-devel/20240808095147.291626-3-ppan...@redhat.com/#R

diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index 00561daa06..7b030ae2cd 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -24,6 +24,7 @@
 #include "qemu/main-loop.h"
 #include "qemu/uuid.h"
 #include "qemu/sockets.h"
+#include "qemu/lockable.h"
 #include "sysemu/runstate.h"
 #include "sysemu/cryptodev.h"
 #include "migration/postcopy-ram.h"
@@ -446,6 +447,10 @@ static int vhost_user_set_log_base(struct vhost_dev *dev, 
uint64_t base,
 .hdr.size = sizeof(msg.payload.log),
 };
 
+struct vhost_user *u = dev->opaque;
+struct VhostUserState *us = u->user;
+QEMU_LOCK_GUARD(&us->vhost_user_request_reply_lock);
+
 /* Send only once with first queue pair */
 if (dev->vq_index != 0) {
 return 0;
@@ -664,6 +669,7 @@ static int send_remove_regions(struct vhost_dev *dev,
bool reply_supported)
 {
 struct vhost_user *u = dev->opaque;
+struct VhostUserState *us = u->user;
 struct vhost_memory_region *shadow_reg;
 int i, fd, shadow_reg_idx, ret;
 ram_addr_t offset;
@@ -685,6 +691,8 @@ static int send_remove_regions(struct vhost_dev *dev,
 vhost_user_fill_msg_region(&region_buffer, shadow_reg, 0);
 msg->payload.mem_reg.region = region_buffer;
 
+QEMU_LOCK_GUARD(&us->vhost_user_request_reply_lock);
+
 ret = vhost_user_write(dev, msg, NULL, 0);
 if (ret < 0) {
 return ret;
@@ -718,6 +726,7 @@ static int send_add_regions(struct vhost_dev *dev,
 bool reply_supported, bool track_ramblocks)
 {
 struct vhost_user *u = dev->opaque;
+struct VhostUserState *us = u->user;
 int i, fd, ret, reg_idx, reg_fd_idx;
 struct vhost_memory_region *reg;
 MemoryRegion *mr;
@@ -746,6 +755,8 @@ static int send_add_regions(struct vhost_dev *dev,
 vhost_user_fill_msg_region(&region_buffer, reg, offset);
 msg->payload.mem_reg.region = region_buffer;
 
+QEMU_LOCK_GUARD(&us->vhost_user_request_reply_lock);
+
 ret = vhost_user_write(dev, msg, &fd, 1);
 if (ret < 0) {
 return ret;
@@ -893,6 +904,7 @@ static int vhost_user_set_mem_table_postcopy(struct 
vhost_dev *dev,
  bool config_mem_slots)
 {
 struct vhost_user *u = dev->opaque;
+struct VhostUserState *us = u->user;
 int fds[VHOST_MEMORY_BASELINE_NREGIONS];
 size_t fd_num = 0;
 VhostUserMsg msg_reply;
@@ -926,6 +938,8 @@ static int v

RE: [PATCH RFC V3 01/29] arm/virt,target/arm: Add new ARMCPU {socket,cluster,core,thread}-id property

2024-08-19 Thread Salil Mehta via
>  From: Igor Mammedov 
>  Sent: Monday, August 12, 2024 9:16 AM
>  To: Gavin Shan 
>  
>  On Mon, 12 Aug 2024 14:35:56 +1000
>  Gavin Shan  wrote:
>  
>  > On 6/14/24 9:36 AM, Salil Mehta wrote:
>  > > This shall be used to store user specified
>  > > topology{socket,cluster,core,thread}
>  > > and shall be converted to a unique 'vcpu-id' which is used as
>  > > slot-index during hot(un)plug of vCPU.
>  > >
>  > > Co-developed-by: Keqian Zhu 
>  > > Signed-off-by: Keqian Zhu 
>  > > Signed-off-by: Salil Mehta 
>  > > ---
>  > >   hw/arm/virt.c | 10 ++
>  > >   include/hw/arm/virt.h | 28 
>  > >   target/arm/cpu.c  |  4 
>  > >   target/arm/cpu.h  |  4 
>  > >   4 files changed, 46 insertions(+)
>  > >
>  >
>  > Those 4 properties are introduced to determine the vCPU's slot, which
>  > is the index to MachineState::possible_cpus::cpus[]. From there, the
>  > CPU object or instance is referenced and then the CPU's state can be
>  > further determined. It sounds reasonable to use the CPU's topology to
>  > determine the index. However, I'm wondering if this can be simplified to
>  use 'cpu-index' or 'index' for a couple of facts: (1) 'cpu-index'
>  
>  Please, don't. We've spent a bunch of time to get rid of cpu-index in user
>  visible interface (well, old NUMA CLI is still there along with 'new'
>  topology based one, but that's the last one).


Agreed. We shouldn't expose CPU index to user.

>  
>  > or 'index' is simplified. Users have to provide 4 parameters in order
>  > to determine its index in the extreme case, for example "device_add
>  > host-arm-cpu, id=cpu7,socket-id=1,
>  > cluster-id=1,core-id=1,thread-id=1". With 'cpu-index' or 'index', it
>  > can be simplified to 'index=7'. (2) The cold-booted and hotpluggable
>  > CPUs are determined by their index instead of their topology. For
>  > example, CPU0/1/2/3 are cold-booted CPUs while CPU4/5/6/7 are
>  hotpluggable CPUs with command lines '-smp maxcpus=8,cpus=4'. So 'index'
>  makes more sense to identify a vCPU's slot.
>  cpu-index have been used for hotplug with x86 machines as a starting point
>  to implement hotplug as it was easy to hack and it has already existed in
>  QEMU.
>  
>  But that didn't scale as was desired and had its own issues.
>  Hence the current interface that majority agreed upon.
>  I don't remember exact arguments anymore (they could be found qemu-
>  devel if needed) Here is a link to the talk that tried to explain why topo
>  based was introduced.
>  
>  http://events17.linuxfoundation.org/sites/events/files/slides/CPU%20Hot-
>  plug%20support%20in%20QEMU.pdf


I think you are referring to slide 19 of the above presentation?

Thanks
Salil.



Re: [PATCH v8 05/13] acpi/ghes: rework the logic to handle HEST source ID

2024-08-19 Thread Igor Mammedov
On Fri, 16 Aug 2024 09:37:37 +0200
Mauro Carvalho Chehab  wrote:

> The current logic is based on a lot of duct tape, with
> offsets calculated based on one define with the number of
> source IDs and an enum.
> 
> Rewrite the logic in a way that it would be more resilient
> of code changes, by moving the source ID count to an enum
> and make the offset calculus more explicit.
> 
> Such change was inspired on a patch from Jonathan Cameron
> splitting the logic to get the CPER address on a separate
> function, as this will be needed to support generic error
> injection.

The patch does too many things, so it's hard to review.
Please split it up into smaller distinct parts, with more specific
commit messages (see some comments below).


> Signed-off-by: Mauro Carvalho Chehab 
> ---
>  hw/acpi/ghes-stub.c  |   3 +-
>  hw/acpi/ghes.c   | 210 ---
>  hw/arm/virt-acpi-build.c |   5 +-
>  include/hw/acpi/ghes.h   |  17 ++--
>  4 files changed, 138 insertions(+), 97 deletions(-)
> 
> diff --git a/hw/acpi/ghes-stub.c b/hw/acpi/ghes-stub.c
> index c315de1802d6..8762449870b5 100644
> --- a/hw/acpi/ghes-stub.c
> +++ b/hw/acpi/ghes-stub.c
> @@ -11,7 +11,8 @@
>  #include "qemu/osdep.h"
>  #include "hw/acpi/ghes.h"
>  
> -int acpi_ghes_record_errors(uint8_t source_id, uint64_t physical_address)
> +int acpi_ghes_record_errors(enum AcpiGhesNotifyType notify,
> +uint64_t physical_address)
>  {
>  return -1;
>  }
> diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
> index df59fd35568c..7870f51e2a9e 100644
> --- a/hw/acpi/ghes.c
> +++ b/hw/acpi/ghes.c
> @@ -28,14 +28,23 @@
>  #include "hw/nvram/fw_cfg.h"
>  #include "qemu/uuid.h"
>  
> -#define ACPI_GHES_ERRORS_FW_CFG_FILE"etc/hardware_errors"
> -#define ACPI_GHES_DATA_ADDR_FW_CFG_FILE "etc/hardware_errors_addr"
> +#define ACPI_HW_ERROR_FW_CFG_FILE   "etc/hardware_errors"
> +#define ACPI_HW_ERROR_ADDR_FW_CFG_FILE  "etc/hardware_errors_addr"
split out the renaming part into a separate preceding patch,
so it won't mask the new code

> +#define ACPI_HEST_ADDR_FW_CFG_FILE  "etc/acpi_table_hest_addr"
>  
>  /* The max size in bytes for one error block */
>  #define ACPI_GHES_MAX_RAW_DATA_LENGTH   (1 * KiB)
>  


> -/* Support ARMv8 SEA notification type error source and GPIO interrupt. */
> -#define ACPI_GHES_ERROR_SOURCE_COUNT2
> +/*
> + * ID numbers used to fill HEST source ID field
> + */
> +enum AcpiHestSourceId {
> +ACPI_HEST_SRC_ID_SEA,
> +ACPI_HEST_SRC_ID_GED,
> +
> +/* Shall be the last one */
> +ACPI_HEST_SRC_ID_COUNT
> +} AcpiHestSourceId;
>  
this rename also should go into its own separate patch.

>  /* Generic Hardware Error Source version 2 */
>  #define ACPI_GHES_SOURCE_GENERIC_ERROR_V2   10
> @@ -63,6 +72,19 @@
>   */
>  #define ACPI_GHES_GESB_SIZE 20
>  
> +/*
> + * Offsets with regards to the start of the HEST table stored at
> + * ags->hest_addr_le, according with the memory layout map at
> + * docs/specs/acpi_hest_ghes.rst.
> + */
> +
> +/* ACPI 4.0: 17.3.2 ACPI Error Source */
> +#define ACPI_HEST_HEADER_SIZE40
> +
> +/* ACPI 6.2: 18.3.2.8 Generic Hardware Error Source version 2 */
> +#define HEST_GHES_V2_TABLE_SIZE  92
> +#define GHES_ACK_OFFSET  (64 + GAS_ADDR_OFFSET + 
> ACPI_HEST_HEADER_SIZE)
> +
>  /*
>   * Values for error_severity field
>   */
> @@ -236,17 +258,17 @@ static int acpi_ghes_record_mem_error(uint64_t 
> error_block_address,
>   * Initialize "etc/hardware_errors" and "etc/hardware_errors_addr" fw_cfg 
> blobs.
>   * See docs/specs/acpi_hest_ghes.rst for blobs format.
>   */
> -void build_ghes_error_table(GArray *hardware_errors, BIOSLinker *linker)
> +static void build_ghes_error_table(GArray *hardware_errors, BIOSLinker 
> *linker)
>  {
>  int i, error_status_block_offset;
>  
>  /* Build error_block_address */
> -for (i = 0; i < ACPI_GHES_ERROR_SOURCE_COUNT; i++) {
> +for (i = 0; i < ACPI_HEST_SRC_ID_COUNT; i++) {
>  build_append_int_noprefix(hardware_errors, 0, sizeof(uint64_t));
>  }
>  
>  /* Build read_ack_register */
> -for (i = 0; i < ACPI_GHES_ERROR_SOURCE_COUNT; i++) {
> +for (i = 0; i < ACPI_HEST_SRC_ID_COUNT; i++) {
>  /*
>   * Initialize the value of read_ack_register to 1, so GHES can be
>   * writable after (re)boot.
> @@ -261,20 +283,20 @@ void build_ghes_error_table(GArray *hardware_errors, 
> BIOSLinker *linker)
>  
>  /* Reserve space for Error Status Data Block */
>  acpi_data_push(hardware_errors,
> -ACPI_GHES_MAX_RAW_DATA_LENGTH * ACPI_GHES_ERROR_SOURCE_COUNT);
> +ACPI_GHES_MAX_RAW_DATA_LENGTH * ACPI_HEST_SRC_ID_COUNT);
>  
>  /* Tell guest firmware to place hardware_errors blob into RAM */
> -bios_linker_loader_alloc(linker, ACPI_GHES_ERRORS_FW_CFG_FILE,
> +bios_linker_loader_alloc(linker, ACPI_HW_ERROR_FW_CFG_FILE,
>   hardware_errors, sizeof(uint6

RE: [PATCH RFC V3 11/29] arm/virt: Create GED dev before *disabled* CPU Objs are destroyed

2024-08-19 Thread Salil Mehta via
>  From: Gavin Shan 
>  Sent: Tuesday, August 13, 2024 2:05 AM
>  To: Salil Mehta ; qemu-devel@nongnu.org;
>  qemu-...@nongnu.org; m...@redhat.com
>  
>  On 6/14/24 9:36 AM, Salil Mehta wrote:
>  > ACPI CPU hotplug state (is_present=_STA.PRESENT,
>  > is_enabled=_STA.ENABLED) for all the possible vCPUs MUST be
>  > initialized during machine init. This is done during the creation of
>  > the GED device. VMM/Qemu MUST expose/fake the ACPI state of the
>  > disabled vCPUs to the Guest kernel as 'present' (_STA.PRESENT) always
>  > i.e. ACPI persistent. If the 'disabled' vCPU objects are destroyed
>  > before the GED device has been created then their ACPI hotplug state
>  > might not get initialized correctly as acpi_persistent flag is part of the
>  CPUState. This will expose wrong status of the unplugged vCPUs to the
>  Guest kernel.
>  >
>  > Hence, moving the GED device creation before disabled vCPU objects get
>  > destroyed as part of the post CPU init routine.
>  >
>  > Signed-off-by: Salil Mehta 
>  > ---
>  >   hw/arm/virt.c | 10 +++---
>  >   1 file changed, 7 insertions(+), 3 deletions(-)
>  >
>  > diff --git a/hw/arm/virt.c b/hw/arm/virt.c index
>  > 918bcb9a1b..5f98162587 100644
>  > --- a/hw/arm/virt.c
>  > +++ b/hw/arm/virt.c
>  > @@ -2467,6 +2467,12 @@ static void machvirt_init(MachineState
>  > *machine)
>  >
>  >   create_gic(vms, sysmem);
>  >
>  > +has_ged = has_ged && aarch64 && firmware_loaded &&
>  > +  virt_is_acpi_enabled(vms);
>  > +if (has_ged) {
>  > +vms->acpi_dev = create_acpi_ged(vms);
>  > +}
>  > +
>  >   virt_cpu_post_init(vms, sysmem);
>  >
>  >   fdt_add_pmu_nodes(vms);
>  > @@ -2489,9 +2495,7 @@ static void machvirt_init(MachineState
>  *machine)
>  >
>  >   create_pcie(vms);
>  >
>  > -if (has_ged && aarch64 && firmware_loaded &&
>  virt_is_acpi_enabled(vms)) {
>  > -vms->acpi_dev = create_acpi_ged(vms);
>  > -} else {
>  > +if (!has_ged) {
>  >   create_gpio_devices(vms, VIRT_GPIO, sysmem);
>  >   }
>  >
>  
>  It's likely the GPIO device can be created before those disabled CPU objects
>  are destroyed. It means the whole chunk of code can be moved together, I
>  think.

I was not totally sure of this. Hence, kept the order of the rest like that.
I can definitely check again if we can do that to reduce the change.

Thanks
Salil.



>  
>  Thanks,
>  Gavin
>  



Re: [PATCH v2 16/17] intel_iommu: Introduce a property to control FS1GP cap bit setting

2024-08-19 Thread Yi Liu

On 2024/8/19 17:41, Duan, Zhenzhong wrote:




-Original Message-
From: Liu, Yi L 
Subject: Re: [PATCH v2 16/17] intel_iommu: Introduce a property to control
FS1GP cap bit setting

On 2024/8/15 11:46, Duan, Zhenzhong wrote:




-Original Message-
From: Liu, Yi L 
Subject: Re: [PATCH v2 16/17] intel_iommu: Introduce a property to control
FS1GP cap bit setting

On 2024/8/5 14:27, Zhenzhong Duan wrote:

When host IOMMU doesn't support FS1GP but vIOMMU does, host IOMMU
can't translate stage-1 page table from guest correctly.


this series is for emulated devices, so the above statement does not
belong to this series. Is there any other reason to have this option?


Good catch, will remove this comment.
In fact, this patch is mainly for passthrough devices where the host IOMMU
doesn't support FS1GP.

I see. To me, as long as the vIOMMU page walk logic supports 1GB large
pages, it's ok to report the FS1GP cap to the VM. But it is still fine to
have this property to opt out of FS1GP if the admin/orchestration layer
(e.g. libvirt) knows that no hw IOMMU has this capability, in which case it
is better to opt out before invoking QEMU.

Is this your motivation for this property?


Exactly.


ok, then let's keep it in this series after refining the commit message a
bit.






Add a property x-cap-fs1gp for user to turn FS1GP off so that
nested page table on host side works.


I guess you would need to sync the FS1GP cap with the host before reporting it
in vIOMMU when it comes to supporting passthrough devices.


Yes, we already have this check, see

https://github.com/yiliu1765/qemu/commit/b7ac7ce3a2e21eb1b3172743ee6f73e80fe67b3a

good to know it. :) Will you fail the VM if the device's iommu does not
support FS1GP or just mask out the FS1GP?


For a cold-plugged VFIO device, it will fail the VM with a "Stage-1 1GB huge
page is unsupported by host IOMMU" error report.
For a hotplugged VFIO device, only the hotplug fails, with "Stage-1 1GB huge
page is unsupported by host IOMMU".

We don't update vIOMMU cap/ecap from host cap/ecap, per Michael's suggestion;
only vIOMMU properties can control them.
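
The behaviour described above can be summarised in a small decision helper.
This is only a sketch under stated assumptions: the enum and the function
name are hypothetical and are not taken from the intel_iommu code; only the
three outcomes (accept, fail VM start, fail the hotplug) come from the text.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcome codes, for illustration only. */
typedef enum {
    FS1GP_ATTACH_OK,           /* device can be attached */
    FS1GP_FAIL_VM_START,       /* cold-plugged device: VM fails to start */
    FS1GP_FAIL_HOTPLUG_ONLY    /* hotplugged device: only the hotplug fails */
} Fs1gpAttachResult;

/*
 * The vIOMMU capability is controlled only by its own property and is
 * never adjusted to match the host, so a mismatch is an error rather
 * than something to be masked out.
 */
static Fs1gpAttachResult fs1gp_check_attach(bool viommu_fs1gp,
                                            bool host_fs1gp,
                                            bool hotplugged)
{
    if (!viommu_fs1gp || host_fs1gp) {
        return FS1GP_ATTACH_OK;
    }
    return hotplugged ? FS1GP_FAIL_HOTPLUG_ONLY : FS1GP_FAIL_VM_START;
}
```

Turning the vIOMMU property off (viommu_fs1gp == false) is what lets a host
without FS1GP accept the device in this model.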


I see. yeah, it makes sense.

--
Regards,
Yi Liu



RE: [PATCH RFC V3 17/29] arm/virt: Release objects for *disabled* possible vCPUs after init

2024-08-19 Thread Salil Mehta via
Hi Gavin,

>  From: Gavin Shan 
>  Sent: Tuesday, August 13, 2024 2:17 AM
>  To: Salil Mehta ; qemu-devel@nongnu.org;
>  qemu-...@nongnu.org; m...@redhat.com
>  
>  On 6/14/24 9:36 AM, Salil Mehta wrote:
>  > During `machvirt_init()`, QOM ARMCPU objects are pre-created along
>  > with the corresponding KVM vCPUs in the host for all possible vCPUs.
>  > This is necessary due to the architectural constraint that KVM
>  > restricts the deferred creation of KVM vCPUs and VGIC
>  > initialization/sizing after VM initialization. Hence, VGIC is pre-sized
>  > with possible vCPUs.
>  >
>  > After the initialization of the machine is complete, the disabled
>  > possible KVM vCPUs are parked in the per-virt-machine list
>  > "kvm_parked_vcpus," and we release the QOM ARMCPU objects for the
>  > disabled vCPUs. These will be re-created when the vCPU is hotplugged
>  > again. The QOM ARMCPU object is then re-attached to the corresponding
>  parked KVM vCPU.
>  >
>  > Alternatively, we could have chosen not to release the QOM CPU objects
>  > and kept reusing them. This approach might require some modifications
>  > to the `qdevice_add()` interface to retrieve the old ARMCPU object
>  > instead of creating a new one for the hotplug request.
>  >
>  > Each of these approaches has its own pros and cons. This prototype
>  > uses the first approach (suggestions are welcome!).
>  >
>  > Co-developed-by: Keqian Zhu 
>  > Signed-off-by: Keqian Zhu 
>  > Signed-off-by: Salil Mehta 
>  > ---
>  >   hw/arm/virt.c | 32 
>  >   1 file changed, 32 insertions(+)
>  >
>  > diff --git a/hw/arm/virt.c b/hw/arm/virt.c index
>  > 9d33f30a6a..a72cd3b20d 100644
>  > --- a/hw/arm/virt.c
>  > +++ b/hw/arm/virt.c
>  > @@ -2050,6 +2050,7 @@ static void virt_cpu_post_init(VirtMachineState
>  *vms, MemoryRegion *sysmem)
>  >   {
>  >   CPUArchIdList *possible_cpus = vms->parent.possible_cpus;
>  >   int max_cpus = MACHINE(vms)->smp.max_cpus;
>  > +MachineState *ms = MACHINE(vms);
>  >   bool aarch64, steal_time;
>  >   CPUState *cpu;
>  >   int n;
>  > @@ -2111,6 +2112,37 @@ static void virt_cpu_post_init(VirtMachineState
>  *vms, MemoryRegion *sysmem)
>  >   }
>  >   }
>  >   }
>  > +
>  > +if (kvm_enabled() || tcg_enabled()) {
>  > +for (n = 0; n < possible_cpus->len; n++) {
>  > +cpu = qemu_get_possible_cpu(n);
>  > +
>  > +/*
>  > + * Now, GIC has been sized with possible CPUs and we dont
>  require
>  > + * disabled vCPU objects to be represented in the QOM. Release
>  the
>  > + * disabled ARMCPU objects earlier used during init for 
> pre-sizing.
>  > + *
>  > + * We fake to the guest through ACPI about the
>  presence(_STA.PRES=1)
>  > + * of these non-existent vCPUs at VMM/qemu and present these
>  as
>  > + * disabled vCPUs(_STA.ENA=0) so that they cant be used. These
>  vCPUs
>  > + * can be later added to the guest through hotplug exchanges
>  when
>  > + * ARMCPU objects are created back again using 'device_add' 
> QMP
>  > + * command.
>  > + */
>  > +/*
>  > + * RFC: Question: Other approach could've been to keep them
>  forever
>  > + * and release it only once when qemu exits as part of 
> finalize or
>  > + * when new vCPU is hotplugged. In the later old could be 
> released
>  > + * for the newly created object for the same vCPU?
>  > + */
>  > +if (!qemu_enabled_cpu(cpu)) {
>  > +CPUArchId *cpu_slot;
>  > +cpu_slot = virt_find_cpu_slot(ms, cpu->cpu_index);
>  > +cpu_slot->cpu = NULL;
>  > +object_unref(OBJECT(cpu));
>  > +}
>  > +}
>  > +}
>  >   }
>  
>  It's probably hard to keep those ARMCPU objects forever. First of all, one
>  vCPU can be hot-added first and then hot-removed afterwards. With those
>  ARMCPU objects kept forever, the syntax of 'device_add' and 'device_del'
>  become broken at least.

I had prototyped both approaches 4 years back. Yes, the interface problem with
device_add was solved by a trick: keep the old vCPU object and, on device_add,
reuse it instead of creating a new vCPU object, then call qdev_realize() on it.

But the bigger problem with this approach is migration. Only realized objects
have state to be migrated. So it might look cleaner in one respect but it had
its own issues.

I think I did share a prototype of this with Igor, who was not in agreement
with it and wanted vCPU objects to be destroyed like in x86. Hence, we stuck
with the current approach.


>  The ideal mechanism would be to avoid instanciating those ARMCPU objects
>  and destroying them soon. I don't know if ms->possible_cpus->cpus[] can
>  fit and how much efforts needed.

This is what 

RE: [PATCH RFC V3 18/29] arm/virt: Add/update basic hot-(un)plug framework

2024-08-19 Thread Salil Mehta via
Hi Gavin,

>  From: Gavin Shan 
>  Sent: Tuesday, August 13, 2024 2:21 AM
>  To: Salil Mehta ; qemu-devel@nongnu.org;
>  qemu-...@nongnu.org; m...@redhat.com
>  
>  On 6/14/24 9:36 AM, Salil Mehta wrote:
>  > Add CPU hot-unplug hooks and update hotplug hooks with additional
>  > sanity checks for use in hotplug paths.
>  >
>  > Note: The functional contents of the hooks (currently left with TODO
>  > comments) will be gradually filled in subsequent patches in an
>  > incremental approach to patch and logic building, which would roughly
>  include the following:
>  >
>  > 1. (Un)wiring of interrupts between vCPU<->GIC.
>  > 2. Sending events to the guest for hot-(un)plug so that the guest can take
>  > appropriate actions.
>  > 3. Notifying the GIC about the hot-(un)plug action so that the vCPU can be
>  > (un)stitched to the GIC CPU interface.
>  > 4. Updating the guest with next boot information for this vCPU in the
>  firmware.
>  >
>  > Co-developed-by: Keqian Zhu 
>  > Signed-off-by: Keqian Zhu 
>  > Signed-off-by: Salil Mehta 
>  > ---
>  >   hw/arm/virt.c | 105
>  ++
>  >   1 file changed, 105 insertions(+)
>  >
>  > diff --git a/hw/arm/virt.c b/hw/arm/virt.c index
>  > a72cd3b20d..f6b8c21f26 100644
>  > --- a/hw/arm/virt.c
>  > +++ b/hw/arm/virt.c
>  > @@ -85,6 +85,7 @@
>  >   #include "hw/virtio/virtio-iommu.h"
>  >   #include "hw/char/pl011.h"
>  >   #include "qemu/guest-random.h"
>  > +#include "qapi/qmp/qdict.h"
>  >
>  >   static GlobalProperty arm_virt_compat[] = {
>  >   { TYPE_VIRTIO_IOMMU_PCI, "aw-bits", "48" }, @@ -3002,11 +3003,23
>  > @@ static void virt_memory_plug(HotplugHandler *hotplug_dev,
>  >   static void virt_cpu_pre_plug(HotplugHandler *hotplug_dev, DeviceState
>  *dev,
>  > Error **errp)
>  >   {
>  > +VirtMachineState *vms = VIRT_MACHINE(hotplug_dev);
>  >   MachineState *ms = MACHINE(hotplug_dev);
>  > +MachineClass *mc = MACHINE_GET_CLASS(ms);
>  >   ARMCPU *cpu = ARM_CPU(dev);
>  >   CPUState *cs = CPU(dev);
>  >   CPUArchId *cpu_slot;
>  >
>  > +if (dev->hotplugged && !vms->acpi_dev) {
>  > +error_setg(errp, "GED acpi device does not exists");
>  > +return;
>  > +}
>  > +
>  > +if (dev->hotplugged && !mc->has_hotpluggable_cpus) {
>  > +error_setg(errp, "CPU hotplug not supported on this machine");
>  > +return;
>  > +}
>  > +
>  >   /* sanity check the cpu */
>  >   if (!object_dynamic_cast(OBJECT(cpu), ms->cpu_type)) {
>  >   error_setg(errp, "Invalid CPU type, expected cpu type:
>  > '%s'", @@ -3049,6 +3062,22 @@ static void
>  virt_cpu_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
>  >   }
>  >   virt_cpu_set_properties(OBJECT(cs), cpu_slot, errp);
>  >
>  > +/*
>  > + * Fix the GIC for this new vCPU being plugged. The QOM CPU object 
> for the
>  > + * new vCPU need to be updated in the corresponding QOM GICv3CPUState 
> object
>  > + * We also need to re-wire the IRQs for this new CPU object. This 
> update
>  > + * is limited to the QOM only and does not affects the KVM. Later has
>  > + * already been pre-sized with possible CPU at VM init time. This is a
>  > + * workaround to the constraints posed by ARM architecture w.r.t 
> supporting
>  > + * CPU Hotplug. Specification does not exist for the later.
>  > + * This patch-up is required both for {cold,hot}-plugged vCPUs. 
> Cold-inited
>  > + * vCPUs have their GIC state initialized during machvit_init().
>  > + */
>  > +if (vms->acpi_dev) {
>  > +/* TODO: update GIC about this hotplug change here */
>  > +/* TODO: wire the GIC<->CPU irqs */
>  > +}
>  > +
>  >   /*
>  >* To give persistent presence view of vCPUs to the guest, ACPI 
> might need
>  >* to fake the presence of the vCPUs to the guest but keep them 
> disabled.
>  > @@ -3060,6 +3089,7 @@ static void virt_cpu_pre_plug(HotplugHandler
>  *hotplug_dev, DeviceState *dev,
>  >   static void virt_cpu_plug(HotplugHandler *hotplug_dev, DeviceState
>  *dev,
>  > Error **errp)
>  >   {
>  > +VirtMachineState *vms = VIRT_MACHINE(hotplug_dev);
>  >   MachineState *ms = MACHINE(hotplug_dev);
>  >   CPUState *cs = CPU(dev);
>  >   CPUArchId *cpu_slot;
>  > @@ -3068,10 +3098,81 @@ static void virt_cpu_plug(HotplugHandler
>  *hotplug_dev, DeviceState *dev,
>  >   cpu_slot = virt_find_cpu_slot(ms, cs->cpu_index);
>  >   cpu_slot->cpu = CPU(dev);
>  >
>  > +/*
>  > + * Update the ACPI Hotplug state both for vCPUs being {hot,cold}-
>  plugged.
>  > + * vCPUs can be cold-plugged using '-device' option. For vCPUs being
>  hot
>  > + * plugged, guest is also notified.
>  > + */
>  > +if (vms->acpi_dev) {
>  > +/* TODO: update acpi hotplug state. Send cpu hotplug event to 
> guest
>  */
>  > +/* TODO: re

RE: [PATCH RFC V3 24/29] target/arm: Add support of *unrealize* ARMCPU during vCPU Hot-unplug

2024-08-19 Thread Salil Mehta via
Hi Alex,

>  From: Alex Bennée 
>  Sent: Friday, August 16, 2024 4:37 PM
>  To: Salil Mehta 
>  
>  Salil Mehta  writes:
>  
>  > vCPU Hot-unplug will result in QOM CPU object unrealization which will
>  > do away with all the vCPU thread creations, allocations, registrations
>  > that happened as part of the realization process. This change
>  > introduces the ARM CPU unrealize function taking care of exactly that.
>  >
>  > Note, initialized KVM vCPUs are not destroyed in host KVM but their
>  > Qemu context is parked at the QEMU KVM layer.
>  >
>  > Co-developed-by: Keqian Zhu 
>  > Signed-off-by: Keqian Zhu 
>  > Signed-off-by: Salil Mehta 
>  > Reported-by: Vishnu Pajjuri 
>  > [VP: Identified CPU stall issue & suggested probable fix]
>  > Signed-off-by: Salil Mehta 
>  > ---
>  >  target/arm/cpu.c   | 101
>  +
>  >  target/arm/cpu.h   |  14 ++
>  >  target/arm/gdbstub.c   |   6 +++
>  >  target/arm/helper.c|  25 ++
>  >  target/arm/internals.h |   3 ++
>  >  target/arm/kvm.c   |   5 ++
>  >  6 files changed, 154 insertions(+)
>  >
>  > diff --git a/target/arm/cpu.c b/target/arm/cpu.c index
>  > c92162fa97..a3dc669309 100644
>  > --- a/target/arm/cpu.c
>  > +++ b/target/arm/cpu.c
>  > @@ -157,6 +157,16 @@ void
>  arm_register_pre_el_change_hook(ARMCPU *cpu, ARMELChangeHookFn
>  *hook,
>  >  QLIST_INSERT_HEAD(&cpu->pre_el_change_hooks, entry, node);  }
>  >
>  > +void arm_unregister_pre_el_change_hooks(ARMCPU *cpu) {
>  > +ARMELChangeHook *entry, *next;
>  > +
>  > +QLIST_FOREACH_SAFE(entry, &cpu->pre_el_change_hooks, node,
>  next) {
>  > +QLIST_REMOVE(entry, node);
>  > +g_free(entry);
>  > +}
>  > +}
>  > +
>  >  void arm_register_el_change_hook(ARMCPU *cpu,
>  ARMELChangeHookFn *hook,
>  >   void *opaque)  { @@ -168,6 +178,16
>  > @@ void arm_register_el_change_hook(ARMCPU *cpu,
>  ARMELChangeHookFn *hook,
>  >  QLIST_INSERT_HEAD(&cpu->el_change_hooks, entry, node);  }
>  >
>  > +void arm_unregister_el_change_hooks(ARMCPU *cpu) {
>  > +ARMELChangeHook *entry, *next;
>  > +
>  > +QLIST_FOREACH_SAFE(entry, &cpu->el_change_hooks, node, next) {
>  > +QLIST_REMOVE(entry, node);
>  > +g_free(entry);
>  > +}
>  > +}
>  > +
>  >  static void cp_reg_reset(gpointer key, gpointer value, gpointer
>  > opaque)  {
>  >  /* Reset a single ARMCPRegInfo register */ @@ -2552,6 +2572,85 @@
>  > static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
>  >  acc->parent_realize(dev, errp);
>  >  }
>  >
>  > +static void arm_cpu_unrealizefn(DeviceState *dev) {
>  > +ARMCPUClass *acc = ARM_CPU_GET_CLASS(dev);
>  > +ARMCPU *cpu = ARM_CPU(dev);
>  > +CPUARMState *env = &cpu->env;
>  > +CPUState *cs = CPU(dev);
>  > +bool has_secure;
>  > +
>  > +has_secure = cpu->has_el3 || arm_feature(env,
>  > + ARM_FEATURE_M_SECURITY);
>  > +
>  > +/* rock 'n' un-roll, whatever happened in the arm_cpu_realizefn
>  cleanly */
>  > +cpu_address_space_destroy(cs, ARMASIdx_NS);
>  
>  On current master this will fail:
>  
>  ../../target/arm/cpu.c: In function ‘arm_cpu_unrealizefn’:
>  ../../target/arm/cpu.c:2626:5: error: implicit declaration of function
>  ‘cpu_address_space_destroy’ [-Werror=implicit-function-declaration]
>   2626 | cpu_address_space_destroy(cs, ARMASIdx_NS);
>| ^
>  ../../target/arm/cpu.c:2626:5: error: nested extern declaration of
>  ‘cpu_address_space_destroy’ [-Werror=nested-externs]
>  cc1: all warnings being treated as errors


The current master already has the arch-agnostic patch-set. I've applied the
RFC V3 to the latest and compiled it. I did not see this issue.

I've created a new branch for your reference.

https://github.com/salil-mehta/qemu/tree/virt-cpuhp-armv8/rfc-v4-rc4

Please let me know if this works for you.


Thanks
Salil.



>  
>  --
>  Alex Bennée
>  Virtualisation Tech Lead @ Linaro


Re: [PATCH v8 06/13] acpi/ghes: add support for generic error injection via QAPI

2024-08-19 Thread Igor Mammedov
On Fri, 16 Aug 2024 09:37:38 +0200
Mauro Carvalho Chehab  wrote:

> Provide a generic interface for error injection via GHESv2.
> 
> This patch is co-authored:
> - original ghes logic to inject a simple ARM record by Shiju Jose;
> - generic logic to handle block addresses by Jonathan Cameron;
> - generic GHESv2 error inject by Mauro Carvalho Chehab;
> 
> Co-authored-by: Jonathan Cameron 
> Co-authored-by: Shiju Jose 
> Co-authored-by: Mauro Carvalho Chehab 
> Signed-off-by: Jonathan Cameron 
> Signed-off-by: Shiju Jose 
> Signed-off-by: Mauro Carvalho Chehab 
> ---
>  hw/acpi/ghes.c  | 57 +
>  hw/acpi/ghes_cper.c |  2 +-
>  2 files changed, 58 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
> index 7870f51e2a9e..a3ae710dcf81 100644
> --- a/hw/acpi/ghes.c
> +++ b/hw/acpi/ghes.c
> @@ -500,6 +500,63 @@ int acpi_ghes_record_errors(enum AcpiGhesNotifyType 
> notify,
>  NotifierList acpi_generic_error_notifiers =
>  NOTIFIER_LIST_INITIALIZER(error_device_notifiers);
>  
> +void ghes_record_cper_errors(uint8_t *cper, size_t len,
> + enum AcpiGhesNotifyType notify, Error **errp)
> +{
> +uint64_t cper_addr, read_ack_start_addr;
> +enum AcpiHestSourceId source;
> +AcpiGedState *acpi_ged_state;
> +AcpiGhesState *ags;
> +uint64_t read_ack;
> +
> +if (ghes_notify_to_source_id(notify, &source)) {
> +error_setg(errp,
> +   "GHES: Invalid error block/ack address(es) for notify %d",
> +   notify);
> +return;
> +}
> +
> +acpi_ged_state = ACPI_GED(object_resolve_path_type("", TYPE_ACPI_GED,
> +   NULL));
> +g_assert(acpi_ged_state);
> +ags = &acpi_ged_state->ghes_state;
> +
> +cper_addr = le64_to_cpu(ags->ghes_addr_le);
   ^^^ suggest to rename to error_block_address
   that way reader can easily match it with spec.

> +cper_addr += ACPI_HEST_SRC_ID_COUNT * sizeof(uint64_t);
and it would be better to merge this with previous line to be more clear
 + to avoid shifting meaning of variable between lines.

> +read_ack_start_addr = cper_addr + source * sizeof(uint64_t);

> +cper_addr += ACPI_HEST_SRC_ID_COUNT * sizeof(uint64_t);
> +cper_addr += source * ACPI_GHES_MAX_RAW_DATA_LENGTH;
I'd avoid changing the meaning of a variable; it adds to the confusion.
Anyway, what is the point of the above math?
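
One way to address both remarks is to compute each address exactly once from
the base, so no variable changes meaning between lines. A minimal sketch,
assuming the layout implied by the quoted arithmetic (one array of
SRC_ID_COUNT 64-bit entries, then the read-ack array, then the per-source
CPER blocks). The constant values, the struct and the function name are
hypothetical, not QEMU's:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical values for illustration; the real ones are in QEMU headers. */
#define SRC_ID_COUNT        2u
#define MAX_RAW_DATA_LENGTH 0x400u

typedef struct GhesAddrs {
    uint64_t read_ack_addr; /* read-ack register for this source */
    uint64_t cper_addr;     /* where the CPER payload is written */
} GhesAddrs;

/* Derive both addresses from the base in single expressions. */
static GhesAddrs ghes_addrs(uint64_t base, unsigned source)
{
    const uint64_t entry = sizeof(uint64_t);
    GhesAddrs a;

    assert(source < SRC_ID_COUNT);
    /* skip the first array, then index into the read-ack array */
    a.read_ack_addr = base + SRC_ID_COUNT * entry + source * entry;
    /* skip both arrays, then index into the CPER data blocks */
    a.cper_addr = base + 2 * SRC_ID_COUNT * entry +
                  (uint64_t)source * MAX_RAW_DATA_LENGTH;
    return a;
}
```

With the hypothetical constants above, base 0x1000 and source 1 give a
read-ack address of 0x1018 and a CPER address of 0x1420.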

> +
> +cpu_physical_memory_read(read_ack_start_addr,
> + &read_ack, sizeof(uint64_t));
s/sizeof(uint64_t)/sizeof(read_ack)/
ditto elsewhere

> +
> +/* zero means OSPM does not acknowledge the error */
> +if (!read_ack) {
> +error_setg(errp,
> +   "Last CPER record was not acknowledged yet");

> +read_ack = 1;
> +cpu_physical_memory_write(read_ack_start_addr,
> +  &read_ack, (uint64_t));
we don't do this for SEV so, why are you setting it to 1 here?


> +return;
> +}
> +
> +read_ack = cpu_to_le64(0);
> +cpu_physical_memory_write(read_ack_start_addr,
> +  &read_ack, sizeof(uint64_t));
> +
> +/* Build CPER record */
> +
> +if (len > ACPI_GHES_MAX_RAW_DATA_LENGTH) {
> +error_setg(errp, "GHES CPER record is too big: %ld", len);
> +}
move check at start of function?
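
For illustration, validating the length before anything else could look like
the sketch below. Note that the quoted hunk reports the error but has no
return after it, so the write still happens. The function name, the plain
buffer standing in for guest memory, and the constant value are all
hypothetical; this is not the QEMU API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_RAW_DATA_LENGTH 0x400u /* hypothetical value */

/*
 * Copy a CPER record into its destination block.
 * Returns 0 on success, -1 if the record does not fit.
 */
static int record_cper(uint8_t *dst, const uint8_t *cper, size_t len)
{
    assert(dst != NULL && cper != NULL);

    /* Reject oversized input first, before any state is modified. */
    if (len > MAX_RAW_DATA_LENGTH) {
        return -1;
    }

    memcpy(dst, cper, len);
    return 0;
}
```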

> +
> +/* Write the generic error data entry into guest memory */
> +cpu_physical_memory_write(cper_addr, cper, len);
> +
> +notifier_list_notify(&acpi_generic_error_notifiers, NULL);
> +}
> +
>  bool acpi_ghes_present(void)
>  {
>  AcpiGedState *acpi_ged_state;
> diff --git a/hw/acpi/ghes_cper.c b/hw/acpi/ghes_cper.c
> index 92ca84d738de..2328dbff7012 100644
> --- a/hw/acpi/ghes_cper.c
> +++ b/hw/acpi/ghes_cper.c
> @@ -29,5 +29,5 @@ void qmp_ghes_cper(const char *qmp_cper,
>  return;
>  }
>  
> -/* TODO: call a function at ghes */
> +ghes_record_cper_errors(cper, len, ACPI_GHES_NOTIFY_GPIO, errp);
>  }




Re: [PATCH v8 12/13] acpi/ghes: cleanup generic error data logic

2024-08-19 Thread Igor Mammedov
On Fri, 16 Aug 2024 09:37:44 +0200
Mauro Carvalho Chehab  wrote:

> Remove comments that are obvious.
> 
> No functional changes.
> 
> Signed-off-by: Mauro Carvalho Chehab 
These comments help if you don't have the spec side by side with the code
to compare. I'd even say such comments are preferable to no comments
when composing an ACPI table.

pls, drop the patch

> ---
>  hw/acpi/ghes.c | 38 +++---
>  1 file changed, 15 insertions(+), 23 deletions(-)
> 
> diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
> index 4f7b6c5ad2b6..a822a5eafaa0 100644
> --- a/hw/acpi/ghes.c
> +++ b/hw/acpi/ghes.c
> @@ -130,34 +130,28 @@ static void build_ghes_hw_error_notification(GArray 
> *table, const uint8_t type)
>   * ACPI 6.1: 18.3.2.7.1 Generic Error Data
>   */
>  static void acpi_ghes_generic_error_data(GArray *table,
> -const uint8_t *section_type, uint32_t error_severity,
> -uint8_t validation_bits, uint8_t flags,
> -uint32_t error_data_length, QemuUUID fru_id,
> -uint64_t time_stamp)
> + const uint8_t *section_type,
> + uint32_t error_severity,
> + uint8_t validation_bits,
> + uint8_t flags,
> + uint32_t error_data_length,
> + QemuUUID fru_id,
> + uint64_t time_stamp)
>  {
>  const uint8_t fru_text[20] = {0};
>  
> -/* Section Type */
>  g_array_append_vals(table, section_type, 16);
> -
> -/* Error Severity */
>  build_append_int_noprefix(table, error_severity, 4);
> +
>  /* Revision */
>  build_append_int_noprefix(table, 0x300, 2);
> -/* Validation Bits */
> +
>  build_append_int_noprefix(table, validation_bits, 1);
> -/* Flags */
>  build_append_int_noprefix(table, flags, 1);
> -/* Error Data Length */
>  build_append_int_noprefix(table, error_data_length, 4);
>  
> -/* FRU Id */
>  g_array_append_vals(table, fru_id.data, ARRAY_SIZE(fru_id.data));
> -
> -/* FRU Text */
>  g_array_append_vals(table, fru_text, sizeof(fru_text));
> -
> -/* Timestamp */
>  build_append_int_noprefix(table, time_stamp, 8);
>  }
>  
> @@ -165,19 +159,17 @@ static void acpi_ghes_generic_error_data(GArray *table,
>   * Generic Error Status Block
>   * ACPI 6.1: 18.3.2.7.1 Generic Error Data
>   */
> -static void acpi_ghes_generic_error_status(GArray *table, uint32_t 
> block_status,
> -uint32_t raw_data_offset, uint32_t raw_data_length,
> -uint32_t data_length, uint32_t error_severity)
> +static void acpi_ghes_generic_error_status(GArray *table,
> +   uint32_t block_status,
> +   uint32_t raw_data_offset,
> +   uint32_t raw_data_length,
> +   uint32_t data_length,
> +   uint32_t error_severity)
>  {
> -/* Block Status */
>  build_append_int_noprefix(table, block_status, 4);
> -/* Raw Data Offset */
>  build_append_int_noprefix(table, raw_data_offset, 4);
> -/* Raw Data Length */
>  build_append_int_noprefix(table, raw_data_length, 4);
> -/* Data Length */
>  build_append_int_noprefix(table, data_length, 4);
> -/* Error Severity */
>  build_append_int_noprefix(table, error_severity, 4);
>  }
>  




RE: [PATCH RFC V3 24/29] target/arm: Add support of *unrealize* ARMCPU during vCPU Hot-unplug

2024-08-19 Thread Salil Mehta via
Hi Peter,

>  From: Peter Maydell 
>  Sent: Friday, August 16, 2024 4:51 PM
>  To: Alex Bennée 
>  
>  On Fri, 16 Aug 2024 at 16:37, Alex Bennée  wrote:
>  >
>  > Salil Mehta  writes:
>  >
>  > > vCPU Hot-unplug will result in QOM CPU object unrealization which
>  > > will do away with all the vCPU thread creations, allocations,
>  > > registrations that happened as part of the realization process. This
>  > > change introduces the ARM CPU unrealize function taking care of exactly
>  that.
>  > >
>  > > Note, initialized KVM vCPUs are not destroyed in host KVM but their
>  > > Qemu context is parked at the QEMU KVM layer.
>  > >
>  > > Co-developed-by: Keqian Zhu 
>  > > Signed-off-by: Keqian Zhu 
>  > > Signed-off-by: Salil Mehta 
>  > > Reported-by: Vishnu Pajjuri 
>  > > [VP: Identified CPU stall issue & suggested probable fix]
>  > > Signed-off-by: Salil Mehta 
>  > > ---
>  > >  target/arm/cpu.c   | 101
>  +
>  > >  target/arm/cpu.h   |  14 ++
>  > >  target/arm/gdbstub.c   |   6 +++
>  > >  target/arm/helper.c|  25 ++
>  > >  target/arm/internals.h |   3 ++
>  > >  target/arm/kvm.c   |   5 ++
>  > >  6 files changed, 154 insertions(+)
>  > >
>  > > diff --git a/target/arm/cpu.c b/target/arm/cpu.c index
>  > > c92162fa97..a3dc669309 100644
>  > > --- a/target/arm/cpu.c
>  > > +++ b/target/arm/cpu.c
>  > > @@ -157,6 +157,16 @@ void
>  arm_register_pre_el_change_hook(ARMCPU *cpu, ARMELChangeHookFn
>  *hook,
>  > >  QLIST_INSERT_HEAD(&cpu->pre_el_change_hooks, entry, node);  }
>  > >
>  > > +void arm_unregister_pre_el_change_hooks(ARMCPU *cpu) {
>  > > +ARMELChangeHook *entry, *next;
>  > > +
>  > > +QLIST_FOREACH_SAFE(entry, &cpu->pre_el_change_hooks, node,
>  next) {
>  > > +QLIST_REMOVE(entry, node);
>  > > +g_free(entry);
>  > > +}
>  > > +}
>  > > +
>  > >  void arm_register_el_change_hook(ARMCPU *cpu,
>  ARMELChangeHookFn *hook,
>  > >   void *opaque)  { @@ -168,6 +178,16
>  > > @@ void arm_register_el_change_hook(ARMCPU *cpu,
>  ARMELChangeHookFn *hook,
>  > >  QLIST_INSERT_HEAD(&cpu->el_change_hooks, entry, node);  }
>  > >
>  > > +void arm_unregister_el_change_hooks(ARMCPU *cpu) {
>  > > +ARMELChangeHook *entry, *next;
>  > > +
>  > > +QLIST_FOREACH_SAFE(entry, &cpu->el_change_hooks, node, next)
>  {
>  > > +QLIST_REMOVE(entry, node);
>  > > +g_free(entry);
>  > > +}
>  > > +}
>  > > +
>  > >  static void cp_reg_reset(gpointer key, gpointer value, gpointer
>  > > opaque)  {
>  > >  /* Reset a single ARMCPRegInfo register */ @@ -2552,6 +2572,85
>  > > @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
>  > >  acc->parent_realize(dev, errp);  }
>  > >
>  > > +static void arm_cpu_unrealizefn(DeviceState *dev) {
>  > > +ARMCPUClass *acc = ARM_CPU_GET_CLASS(dev);
>  > > +ARMCPU *cpu = ARM_CPU(dev);
>  > > +CPUARMState *env = &cpu->env;
>  > > +CPUState *cs = CPU(dev);
>  > > +bool has_secure;
>  > > +
>  > > +has_secure = cpu->has_el3 || arm_feature(env,
>  > > + ARM_FEATURE_M_SECURITY);
>  > > +
>  > > +/* rock 'n' un-roll, whatever happened in the arm_cpu_realizefn
>  cleanly */
>  > > +cpu_address_space_destroy(cs, ARMASIdx_NS);
>  >
>  > On current master this will fail:
>  >
>  > ../../target/arm/cpu.c: In function ‘arm_cpu_unrealizefn’:
>  > ../../target/arm/cpu.c:2626:5: error: implicit declaration of function
>  ‘cpu_address_space_destroy’ [-Werror=implicit-function-declaration]
>  >  2626 | cpu_address_space_destroy(cs, ARMASIdx_NS);
>  >   | ^
>  > ../../target/arm/cpu.c:2626:5: error: nested extern declaration of
>  > ‘cpu_address_space_destroy’ [-Werror=nested-externs]
>  > cc1: all warnings being treated as errors
>  
>  We shouldn't need to explicitly call cpu_address_space_destroy() from a
>  target-specific unrealize anyway: we can do it all from the base class (and I
>  think this would fix some leaks in current code for targets that hot-unplug,
>  though I should check that). Otherwise you need to duplicate all the logic for
>  figuring out which address spaces we created in realize, which is fragile and
>  not necessary when all we want to do is "delete every address space the
>  CPU object has"
>  and we want to do that for every target architecture always.
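
To illustrate the idea of tearing down every address space from the base
class, here is a toy model. It is a sketch only: the Toy* types and
functions are hypothetical stand-ins and do not correspond to QEMU's
CPUState or its address space API.

```c
#include <assert.h>
#include <stdlib.h>

#define TOY_MAX_AS 4

typedef struct ToyAddressSpace {
    int id;
} ToyAddressSpace;

typedef struct ToyCPU {
    ToyAddressSpace *as[TOY_MAX_AS]; /* NULL means "never created" */
} ToyCPU;

/* What a target-specific realize might do: create only the slots it needs. */
static int toy_cpu_add_as(ToyCPU *cpu, int idx)
{
    assert(idx >= 0 && idx < TOY_MAX_AS);
    cpu->as[idx] = calloc(1, sizeof(*cpu->as[idx]));
    if (!cpu->as[idx]) {
        return -1;
    }
    cpu->as[idx]->id = idx;
    return 0;
}

/*
 * Base-class unrealize: delete every address space the CPU has, without
 * knowing which ones the target created. Returns the number released.
 */
static int toy_base_unrealize(ToyCPU *cpu)
{
    int released = 0;

    for (int i = 0; i < TOY_MAX_AS; i++) {
        if (cpu->as[i]) {
            free(cpu->as[i]);
            cpu->as[i] = NULL;
            released++;
        }
    }
    return released;
}
```

The target code never has to repeat the "which ones did realize create"
logic, which is the fragility mentioned above.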


Agreed, but I would suggest making it optional, i.e. if an architecture wants
to release them from its own code, that should be allowed. This also ensures
clarity of the flows:

https://lore.kernel.org/qemu-devel/a308e1f4f06f4e3ab6ab51f353601...@huawei.com/


Thanks
Salil.



>  
>  -- PMM


RE: [PATCH RFC V3 24/29] target/arm: Add support of *unrealize* ARMCPU during vCPU Hot-unplug

2024-08-19 Thread Salil Mehta via
>  From: Peter Maydell 
>  Sent: Friday, August 16, 2024 6:00 PM
>  To: Alex Bennée 
>  
>  On Fri, 16 Aug 2024 at 16:50, Peter Maydell 
>  wrote:
>  > We shouldn't need to explicitly call cpu_address_space_destroy() from
>  > a target-specific unrealize anyway: we can do it all from the base
>  > class (and I think this would fix some leaks in current code for
>  > targets that hot-unplug, though I should check that). Otherwise you
>  > need to duplicate all the logic for figuring out which address spaces
>  > we created in realize, which is fragile and not necessary when all we
>  > want to do is "delete every address space the CPU object has"
>  > and we want to do that for every target architecture always.
>  
>  I have a patch to do this now, but I need to test it a bit more and confirm
>  (or disprove) my hypothesis that we're currently leaking memory on existing
>  architectures with vCPU hot-unplug before I send it out.

I think you are referring to this patch?

https://lore.kernel.org/qemu-devel/20230918160257.30127-9-phi...@linaro.org/


>  
>  -- PMM


[PATCH] hyperv: Make sure SynIC state is really updated before KVM_RUN

2024-08-19 Thread Vitaly Kuznetsov
'hyperv_synic' test from KVM unittests was observed to be flaky on certain
hardware (hangs sometimes). Debugging shows that the problem happens in
hyperv_sint_route_new() when the test tries to set up a new SynIC
route. The function bails out on:

 if (!synic->sctl_enabled) {
 goto cleanup;
 }

but the test writes to HV_X64_MSR_SCONTROL just before it starts
establishing SINT routes. Further investigation shows that
synic_update() (called from async_synic_update()) happens after the SINT
setup attempt and not before. Apparently, the comment before
async_safe_run_on_cpu() in kvm_hv_handle_exit() does not correctly describe
the guarantees async_safe_run_on_cpu() gives. In particular, async work
added to a CPU is actually processed from qemu_wait_io_event() which is not
always called before KVM_RUN, i.e. kvm_cpu_exec() checks whether an exit
request is pending for a CPU and if not, keeps running the vCPU until it
meets an exit it can't handle internally. Hyper-V specific MSR writes do
not automatically trigger an exit.

Fix the issue by simply raising an exit request for the vCPU where SynIC
update was queued. This is not a performance critical path as SynIC state
does not get updated so often (and async_safe_run_on_cpu() is a big hammer
anyways).

Reported-by: Jan Richter 
Signed-off-by: Vitaly Kuznetsov 
---
 target/i386/kvm/hyperv.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/target/i386/kvm/hyperv.c b/target/i386/kvm/hyperv.c
index b94f12acc2c9..70b89cacf94b 100644
--- a/target/i386/kvm/hyperv.c
+++ b/target/i386/kvm/hyperv.c
@@ -80,6 +80,7 @@ int kvm_hv_handle_exit(X86CPU *cpu, struct kvm_hyperv_exit *exit)
  * necessary because memory hierarchy is being changed
  */
 async_safe_run_on_cpu(CPU(cpu), async_synic_update, RUN_ON_CPU_NULL);
+cpu_exit(CPU(cpu));
 
 return EXCP_INTERRUPT;
 case KVM_EXIT_HYPERV_HCALL: {
-- 
2.46.0
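
The ordering problem described in the commit message can be modelled with a
few flags. This is a deliberately simplified toy, not QEMU code: the Toy*
names are hypothetical, and the loop only mirrors the behaviour stated
above (queued work is drained in the wait-for-I/O step, which is reached
only when an exit request is pending).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct ToyVCPU {
    bool work_queued;  /* stands in for the queued SynIC update */
    bool exit_request; /* stands in for the flag an exit request raises */
    bool sctl_enabled; /* state the queued work would publish */
} ToyVCPU;

/* Queue the update; 'kick' models raising an exit request afterwards. */
static void toy_queue_update(ToyVCPU *cpu, bool kick)
{
    assert(cpu != NULL);
    cpu->work_queued = true;
    if (kick) {
        cpu->exit_request = true;
    }
}

/* Models the wait-for-I/O step: the only place queued work is drained. */
static void toy_wait_io_event(ToyVCPU *cpu)
{
    if (cpu->work_queued) {
        cpu->sctl_enabled = true;
        cpu->work_queued = false;
    }
}

/*
 * One pass of the vCPU loop. Returns true if the guest would be re-entered
 * while the update is still pending, i.e. with stale state.
 */
static bool toy_vcpu_iteration(ToyVCPU *cpu)
{
    assert(cpu != NULL);
    if (cpu->exit_request) {
        cpu->exit_request = false;
        toy_wait_io_event(cpu);
    }
    return cpu->work_queued;
}
```

Without the kick the guest resumes before the update is applied; with it,
the update is applied first, which is the effect the one-line fix aims for.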




RE: [PATCH RFC V3 06/29] arm/virt,kvm: Pre-create disabled possible vCPUs @machine init

2024-08-19 Thread Salil Mehta via
Hi Gavin,

>  From: Gavin Shan 
>  Sent: Monday, August 19, 2024 6:32 AM
>  To: Salil Mehta ; qemu-devel@nongnu.org;
>  qemu-...@nongnu.org; m...@redhat.com
>  
>  On 6/14/24 9:36 AM, Salil Mehta wrote:
>  > In the ARMv8 architecture, the GIC must know all the CPUs it is
>  > connected to during its initialization, and this cannot change
>  > afterward. This must be ensured during the initialization of the VGIC
>  > as well in KVM, which requires all vCPUs to be created and present
>  > during its initialization. This is necessary
>  > because:
>  >
>  > 1. The association between GICC and MPIDR must be fixed at VM
>  initialization
>  > time. This is represented by the register `GIC_TYPER(mp_affinity,
>  proc_num)`.
>  > 2. GICC (CPU interfaces), GICR (redistributors), etc., must all be
>  > initialized at boot time.
>  > 3. Memory regions associated with GICR, etc., cannot be changed (added,
>  > deleted, or modified) after the VM has been initialized.
>  >
>  > This patch adds support to pre-create all possible vCPUs within the
>  > host using the KVM interface as part of the virtual machine
>  > initialization. These vCPUs can later be attached to QOM/ACPI when
>  > they are actually hot-plugged and made present.
>  >
>  > Co-developed-by: Keqian Zhu 
>  > Signed-off-by: Keqian Zhu 
>  > Signed-off-by: Salil Mehta 
>  > Reported-by: Vishnu Pajjuri 
>  > [VP: Identified CPU stall issue & suggested probable fix]
>  > ---
>  >   hw/arm/virt.c | 56 +++-
>  ---
>  >   include/hw/core/cpu.h |  1 +
>  >   target/arm/cpu64.c|  1 +
>  >   target/arm/kvm.c  | 41 ++-
>  >   target/arm/kvm_arm.h  | 11 +
>  >   5 files changed, 99 insertions(+), 11 deletions(-)
>  >
>  
>  The vCPU file descriptor is associated with a feature bitmap when the file
>  descriptor is initialized by ioctl(vm_fd, KVM_ARM_VCPU_INIT, &init). The
>  feature bitmap is sorted out based on the vCPU properties. The vCPU
>  properties can be different when the vCPU file descriptor is initialized for
>  the first time when the vCPU is instantiated, and re-initialized when the
>  vCPU is hot added.

  
>  It can lead to system crash as below. We probably need a mechanism to
>  disallow passing extra properties when vCPU is hot added to avoid the
>  conflicts to the global properties from the command line "-cpu
>  host,pmu=on". Some of the properties like "id", "socket-id"
>  are still needed.


Yes, good catch. I knew about that, but it almost slipped under my radar.
Thanks for pointing it out and reminding me. We need a check there. Will fix it.


>  
>  /home/gavin/sandbox/qemu.main/build/qemu-system-aarch64  \
>  -accel kvm -machine virt,gic-version=host,nvdimm=on  \
>  -cpu host -smp maxcpus=2,cpus=1,sockets=2,clusters=1,cores=1,threads=1 \
>  -m 4096M,slots=16,maxmem=128G\
>  -object memory-backend-ram,id=mem0,size=2048M\
>  -object memory-backend-ram,id=mem1,size=2048M\
>  -numa node,nodeid=0,memdev=mem0,cpus=0-0 \
>  -numa node,nodeid=1,memdev=mem1,cpus=1-1 \
>  -L /home/gavin/sandbox/qemu.main/build/pc-bios   \
>  -monitor none -serial mon:stdio -nographic   \
>  -gdb tcp:: -qmp tcp:localhost:,server,wait=off   \
>  -bios /home/gavin/sandbox/qemu.main/build/pc-bios/edk2-aarch64-code.fd \
>  -kernel /home/gavin/sandbox/linux.guest/arch/arm64/boot/Image\
>  -initrd /home/gavin/sandbox/images/rootfs.cpio.xz\
>  -append memhp_default_state=online_movable   \
>   :
>  (qemu) device_add host-arm-cpu,id=cpu1,socket-id=1,pmu=off
>  kvm_arch_init_vcpu: Error -22 from kvm_arm_vcpu_init()
>  qemu-system-aarch64: kvm_init_vcpu: kvm_arch_init_vcpu failed (1):
>  Invalid argument

Yes. thanks.

>  
>  Thanks,
>  Gavin
>  
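
A minimal sketch of the kind of check discussed above, not code from the series: it compares the feature bits used when the vCPU was pre-created with those implied by the hot-add properties and reports any mismatch, so a request such as pmu=off against a pmu=on boot configuration could be rejected before KVM_ARM_VCPU_INIT fails with -22. VcpuInitFeatures, the FEAT_* bits and vcpu_hotplug_features_ok() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the feature bitmap handed to KVM_ARM_VCPU_INIT. */
typedef struct {
    uint32_t features;
} VcpuInitFeatures;

/* Illustrative feature bits; names and values are assumptions. */
enum {
    FEAT_PMU = 1u << 0,
    FEAT_SVE = 1u << 1,
};

/*
 * Return true if the features requested at hot-add time match the ones
 * the vCPU was pre-created with. On mismatch, *conflicting (if non-NULL)
 * receives the bits that differ so the caller can report them.
 */
static bool vcpu_hotplug_features_ok(const VcpuInitFeatures *precreated,
                                     const VcpuInitFeatures *requested,
                                     uint32_t *conflicting)
{
    uint32_t diff = precreated->features ^ requested->features;

    if (conflicting) {
        *conflicting = diff;
    }
    return diff == 0;
}
```

With this shape, topology properties such as "id" or "socket-id" would stay outside the compared bitmap, so they can still be passed at hot-add time.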



Re: [PATCH] gitlab-ci: Build MSYS2 job using multiple CPUs

2024-08-19 Thread Thomas Huth



According to
https://docs.gitlab.com/ee/ci/runners/hosted_runners/windows.html the
Windows shared runner should indeed have 2 vCPUs nowadays! Maybe it is worth
mentioning that in the patch description?


Also, how much faster does the job now run for you?

On 19/08/2024 13.21, Philippe Mathieu-Daudé wrote:

Signed-off-by: Philippe Mathieu-Daudé 
---
I don't know how to use PowerShell to run nproc+1 jobs
to optimize jobs waiting on I/O.


Well, we're calling into bash there, so you could do it as part of the bash 
statement?


 Thomas


---
  .gitlab-ci.d/windows.yml | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/.gitlab-ci.d/windows.yml b/.gitlab-ci.d/windows.yml
index a83f23a786..35ccb74fee 100644
--- a/.gitlab-ci.d/windows.yml
+++ b/.gitlab-ci.d/windows.yml
@@ -110,6 +110,7 @@ msys2-64bit:
mingw-w64-x86_64-usbredir
mingw-w64-x86_64-zstd"
- Write-Output "Running build at $(Get-Date -Format u)"
+  - $env:JOBS = $(.\msys64\usr\bin\bash -lc nproc)
- $env:CHERE_INVOKING = 'yes'  # Preserve the current working directory
- $env:MSYS = 'winsymlinks:native' # Enable native Windows symlink
- $env:CCACHE_BASEDIR = "$env:CI_PROJECT_DIR"
@@ -121,7 +122,7 @@ msys2-64bit:
- cd build
- ..\msys64\usr\bin\bash -lc "ccache --zero-stats"
- ..\msys64\usr\bin\bash -lc "../configure --enable-fdt=system 
$CONFIGURE_ARGS"
-  - ..\msys64\usr\bin\bash -lc "make"
+  - ..\msys64\usr\bin\bash -lc "make -j$env:JOBS"
- ..\msys64\usr\bin\bash -lc "make check MTESTARGS='$TEST_ARGS' || { cat 
meson-logs/testlog.txt; exit 1; } ;"
- ..\msys64\usr\bin\bash -lc "ccache --show-stats"
- Write-Output "Finished build at $(Get-Date -Format u)"





Re: [PATCH RFC V3 24/29] target/arm: Add support of *unrealize* ARMCPU during vCPU Hot-unplug

2024-08-19 Thread Peter Maydell
On Mon, 19 Aug 2024 at 13:59, Salil Mehta  wrote:
>
> >  From: Peter Maydell 
> >  Sent: Friday, August 16, 2024 6:00 PM
> >  To: Alex Bennée 
> >
> >  On Fri, 16 Aug 2024 at 16:50, Peter Maydell 
> >  wrote:
> >  > We shouldn't need to explicitly call cpu_address_space_destroy() from
> >  > a target-specific unrealize anyway: we can do it all from the base
> >  > class (and I think this would fix some leaks in current code for
> >  > targets that hot-unplug, though I should check that). Otherwise you
> >  > need to duplicate all the logic for figuring out which address spaces
> >  > we created in realize, which is fragile and not necessary when all we
> >  > want to do is "delete every address space the CPU object has"
> >  > and we want to do that for every target architecture always.
> >
> >  I have a patch to do this now, but I need to test it a bit more and
> >  confirm (or disprove) my hypothesis that we're currently leaking memory
> >  on existing architectures with vCPU hot-unplug before I send it out.
>
> I think you are referring to this patch?
>
> https://lore.kernel.org/qemu-devel/20230918160257.30127-9-phi...@linaro.org/

I'd forgotten that Phil had sent that patch out. My patch
is a bit different because it refactors cpu_address_space_destroy()
into a single function that destroys all the ASes (and so we
don't for instance need cpu->cpu_ases_count any more).

-- PMM



Re: [PATCH RFC V3 24/29] target/arm: Add support of *unrealize* ARMCPU during vCPU Hot-unplug

2024-08-19 Thread Peter Maydell
On Mon, 19 Aug 2024 at 13:58, Salil Mehta  wrote:
>
> Hi Peter,
>
> >  From: Peter Maydell 
> >
> >  We shouldn't need to explicitly call cpu_address_space_destroy() from a
> >  target-specific unrealize anyway: we can do it all from the base class
> >  (and I think this would fix some leaks in current code for targets that
> >  hot-unplug, though I should check that). Otherwise you need to duplicate
> >  all the logic for figuring out which address spaces we created in
> >  realize, which is fragile and not necessary when all we want to do is
> >  "delete every address space the CPU object has"
> >  and we want to do that for every target architecture always.
>
>
> Agreed, but I would suggest making it optional, i.e. if an architecture wants
> to release an address space from its own code, that should be allowed. This
> also keeps the flows clear:
>
> https://lore.kernel.org/qemu-devel/a308e1f4f06f4e3ab6ab51f353601...@huawei.com/

Do you have any concrete examples where a target arch would want to
explicitly release an AS from its own code? Unless there's a
real use case for doing that, I think that "common code always
does the cleanup of the ASes, nothing else ever does" is a
simple design rule that avoids the need for target-specific code
and means we don't need complicated handling for "some of the
ASes in cpu->cpu_ases are live and some have been released":
either the CPU is realized and they're all valid, or else
we're in the process of unrealizing the CPU and we get rid of
them all at once.

thanks
-- PMM
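
To illustrate the design rule above, here is a toy sketch, not QEMU code: the common unrealize path releases every address space the CPU owns in one pass, so no target-specific code has to track which ones are live. ToyCPU, ToyAddressSpace and both functions are hypothetical stand-ins, not the real CPUState or cpu_address_space_destroy().

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for AddressSpace and CPUState. */
typedef struct ToyAddressSpace {
    int id;
} ToyAddressSpace;

typedef struct ToyCPU {
    ToyAddressSpace **ases;
    int num_ases;
} ToyCPU;

/* Realize: create all address spaces up front. */
static void toy_cpu_realize(ToyCPU *cpu, int num_ases)
{
    assert(num_ases > 0);
    cpu->ases = calloc((size_t)num_ases, sizeof(*cpu->ases));
    assert(cpu->ases);
    cpu->num_ases = num_ases;
    for (int i = 0; i < num_ases; i++) {
        cpu->ases[i] = malloc(sizeof(*cpu->ases[i]));
        assert(cpu->ases[i]);
        cpu->ases[i]->id = i;
    }
}

/*
 * Common unrealize path: release every address space at once.
 * Returns how many were released; calling it again is a no-op.
 */
static int toy_cpu_destroy_address_spaces(ToyCPU *cpu)
{
    int released = 0;

    for (int i = 0; i < cpu->num_ases; i++) {
        free(cpu->ases[i]);
        released++;
    }
    free(cpu->ases);
    cpu->ases = NULL;
    cpu->num_ases = 0;
    return released;
}
```

Because the CPU is either fully realized (all entries valid) or being torn down (all entries freed together), there is no intermediate state where only some entries are live.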



[PATCH] softmmu: Support concurrent bounce buffers

2024-08-19 Thread Mattias Nissler
When DMA memory can't be directly accessed, as is the case when
running the device model in a separate process without shareable DMA
file descriptors, bounce buffering is used.

It is not uncommon for device models to request mapping of several DMA
regions at the same time. Examples include:
 * net devices, e.g. when transmitting a packet that is split across
   several TX descriptors (observed with igb)
 * USB host controllers, when handling a packet with multiple data TRBs
   (observed with xhci)

Previously, qemu only provided a single bounce buffer per AddressSpace
and would fail DMA map requests while the buffer was already in use. In
turn, this would cause DMA failures that ultimately manifest as hardware
errors from the guest perspective.

This change allocates DMA bounce buffers dynamically instead of
supporting only a single buffer. Thus, multiple DMA mappings work
correctly also when RAM can't be mmap()-ed.

The total bounce buffer allocation size is limited individually for each
AddressSpace. The default limit is 4096 bytes, matching the previous
maximum buffer size. A new x-max-bounce-buffer-size parameter is
provided to configure the limit for PCI devices.

Signed-off-by: Mattias Nissler 
Reviewed-by: Philippe Mathieu-Daudé 
Acked-by: Peter Xu 
---
This patch is split out from my "Support message-based DMA in vfio-user server"
series. With the series having been partially applied, I'm splitting this one
out as the only remaining patch to system emulation code in the hope to
simplify getting it landed. The code has previously been reviewed by Stefan
Hajnoczi and Peter Xu. This latest version includes changes to switch the
bounce buffer size bookkeeping to `size_t` as requested and LGTM'd by Phil in
v9.
---
 hw/pci/pci.c|  8 
 include/exec/memory.h   | 14 +++
 include/hw/pci/pci_device.h |  3 ++
 system/memory.c |  5 ++-
 system/physmem.c| 82 ++---
 5 files changed, 76 insertions(+), 36 deletions(-)

diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index fab86d0567..d2caf3ee8b 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -85,6 +85,8 @@ static Property pci_props[] = {
 QEMU_PCIE_ERR_UNC_MASK_BITNR, true),
 DEFINE_PROP_BIT("x-pcie-ari-nextfn-1", PCIDevice, cap_present,
 QEMU_PCIE_ARI_NEXTFN_1_BITNR, false),
+DEFINE_PROP_SIZE32("x-max-bounce-buffer-size", PCIDevice,
+ max_bounce_buffer_size, DEFAULT_MAX_BOUNCE_BUFFER_SIZE),
 DEFINE_PROP_END_OF_LIST()
 };
 
@@ -1204,6 +1206,8 @@ static PCIDevice *do_pci_register_device(PCIDevice 
*pci_dev,
"bus master container", UINT64_MAX);
 address_space_init(&pci_dev->bus_master_as,
&pci_dev->bus_master_container_region, pci_dev->name);
+pci_dev->bus_master_as.max_bounce_buffer_size =
+pci_dev->max_bounce_buffer_size;
 
 if (phase_check(PHASE_MACHINE_READY)) {
 pci_init_bus_master(pci_dev);
@@ -2633,6 +2637,10 @@ static void pci_device_class_init(ObjectClass *klass, 
void *data)
 k->unrealize = pci_qdev_unrealize;
 k->bus_type = TYPE_PCI_BUS;
 device_class_set_props(k, pci_props);
+object_class_property_set_description(
+klass, "x-max-bounce-buffer-size",
+"Maximum buffer size allocated for bounce buffers used for mapped "
+"access to indirect DMA memory");
 }
 
 static void pci_device_class_base_init(ObjectClass *klass, void *data)
diff --git a/include/exec/memory.h b/include/exec/memory.h
index 296fd068c0..e5e865d1a9 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -1084,13 +1084,7 @@ typedef struct AddressSpaceMapClient {
 QLIST_ENTRY(AddressSpaceMapClient) link;
 } AddressSpaceMapClient;
 
-typedef struct {
-MemoryRegion *mr;
-void *buffer;
-hwaddr addr;
-hwaddr len;
-bool in_use;
-} BounceBuffer;
+#define DEFAULT_MAX_BOUNCE_BUFFER_SIZE (4096)
 
 /**
  * struct AddressSpace: describes a mapping of addresses to #MemoryRegion 
objects
@@ -1110,8 +1104,10 @@ struct AddressSpace {
 QTAILQ_HEAD(, MemoryListener) listeners;
 QTAILQ_ENTRY(AddressSpace) address_spaces_link;
 
-/* Bounce buffer to use for this address space. */
-BounceBuffer bounce;
+/* Maximum DMA bounce buffer size used for indirect memory map requests */
+size_t max_bounce_buffer_size;
+/* Total size of bounce buffers currently allocated, atomically accessed */
+size_t bounce_buffer_size;
 /* List of callbacks to invoke when buffers free up */
 QemuMutex map_client_list_lock;
 QLIST_HEAD(, AddressSpaceMapClient) map_client_list;
diff --git a/include/hw/pci/pci_device.h b/include/hw/pci/pci_device.h
index 15694f2489..91df40f989 100644
--- a/include/hw/pci/pci_device.h
+++ b/include/hw/pci/pci_device.h
@@ -167,6 +167,9 @@ struct PCIDevice {
 /* ID of standby device in net_failover pair */
 char *failover_pair_id;
 uint32_t acpi_index;
+
+

Re: [PATCH v3] hw/ppc: Implement -dtb support for PowerNV

2024-08-19 Thread Aditya Gupta

Hi Cedric,


On 15/08/24 23:22, Cédric Le Goater wrote:


I don't think this is a bug fix, is it? AFAIUI, it is a debug
feature for skiboot. It's QEMU 9.2 material.


Thanks for answering Nick's question, I did not check my mails.

Yes, it can be considered a debug feature.


One little nit is MachineState.fdt vs PnvMachineState.fdt
which is now confusing. I would call the new PnvMachineState member
something like fdt_from_dtb, or fdt_override?


I agree, this is confusing. Could machine->fdt be used instead?


Sure, will use it.


Thanks,

Aditya Gupta




The other question... Some machines rebuild fdt at init, others at
reset time. As far as I understood, spapr has to rebuild on reset
because C-A-S call can update the fdt so you have to undo that on
reset. 


C-A-S is a guest OS hcall. reset is called before the guest OS
is started.


Did powernv just copy that without really needing it, I wonder?
Maybe that could be rearranged to just do it at init time (e.g., see
hw/riscv/virt.c which is simpler).


The machine is aware of user created devices (on the command line)
only at reset time.

Thanks,

C.






Thanks,
Nick



---
Changelog
===
v3:
  + use 'load_device_tree' to read the device tree, instead of 
g_file_get_contents

  + tested that passed dtb does NOT get ignored on system_reset

v2:
  + move reading dtb and warning to pnv_init

v1:
  + use 'g_file_get_contents' and add check for -append & -dtb as 
suggested by Daniel

---
---
  hw/ppc/pnv.c | 34 ++
  include/hw/ppc/pnv.h |  2 ++
  2 files changed, 32 insertions(+), 4 deletions(-)

diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
index 3526852685b4..14225f7e48af 100644
--- a/hw/ppc/pnv.c
+++ b/hw/ppc/pnv.c
@@ -736,10 +736,13 @@ static void pnv_reset(MachineState *machine, 
ShutdownCause reason)

  }
  }
  -    fdt = pnv_dt_create(machine);
-
-    /* Pack resulting tree */
-    _FDT((fdt_pack(fdt)));
+    if (pnv->fdt) {
+    fdt = pnv->fdt;
+    } else {
+    fdt = pnv_dt_create(machine);
+    /* Pack resulting tree */
+    _FDT((fdt_pack(fdt)));
+    }
    qemu_fdt_dumpdtb(fdt, fdt_totalsize(fdt));
  cpu_physical_memory_write(PNV_FDT_ADDR, fdt, fdt_totalsize(fdt));
@@ -952,6 +955,14 @@ static void pnv_init(MachineState *machine)
  g_free(sz);
  exit(EXIT_FAILURE);
  }
+
+    /* checks for invalid option combinations */
+    if (machine->dtb && (strlen(machine->kernel_cmdline) != 0)) {
+    error_report("-append and -dtb cannot be used together, as 
passed"

+    " command line is ignored in case of custom dtb");
+    exit(EXIT_FAILURE);
+    }
+
  memory_region_add_subregion(get_system_memory(), 0, 
machine->ram);

    /*
@@ -1003,6 +1014,21 @@ static void pnv_init(MachineState *machine)
  }
  }
  +    /* load dtb if passed */
+    if (machine->dtb) {
+    int fdt_size;
+
+    warn_report("with manually passed dtb, some options like 
'-append'"
+    " will get ignored and the dtb passed will be used 
as-is");

+
+    /* read the file 'machine->dtb', and load it into 'fdt' 
buffer */

+    pnv->fdt = load_device_tree(machine->dtb, &fdt_size);
+    if (!pnv->fdt) {
+    error_report("Could not load dtb '%s'", machine->dtb);
+    exit(1);
+    }
+    }
+
  /* MSIs are supported on this platform */
  msi_nonbroken = true;
  diff --git a/include/hw/ppc/pnv.h b/include/hw/ppc/pnv.h
index fcb6699150c8..20b68fd9264e 100644
--- a/include/hw/ppc/pnv.h
+++ b/include/hw/ppc/pnv.h
@@ -91,6 +91,8 @@ struct PnvMachineState {
  uint32_t initrd_base;
  long initrd_size;
  +    void *fdt;
+
  uint32_t num_chips;
  PnvChip  **chips;








[PATCH v11 1/2] Update subprojects/libvfio-user

2024-08-19 Thread Mattias Nissler
Brings in assorted bug fixes. The following are of particular interest
with respect to message-based DMA support:

* bb308a2 "Fix address calculation for message-based DMA"
  Corrects a bug in DMA address calculation.

* 1569a37 "Pass server->client command over a separate socket pair"
  Adds support for separate sockets for either command direction,
  addressing a bug where libvfio-user gets confused if both client and
  server send commands concurrently.

Reviewed-by: Jagannathan Raman 
Signed-off-by: Mattias Nissler 
---
 subprojects/libvfio-user.wrap | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/subprojects/libvfio-user.wrap b/subprojects/libvfio-user.wrap
index 416955ca45..3dd08768ed 100644
--- a/subprojects/libvfio-user.wrap
+++ b/subprojects/libvfio-user.wrap
@@ -1,4 +1,4 @@
 [wrap-git]
 url = https://gitlab.com/qemu-project/libvfio-user.git
-revision = 0b28d205572c80b568a1003db2c8f37ca333e4d7
+revision = b1a156d86f55a8fa3f78ece5bee7748ec75e7b82
 depth = 1
-- 
2.34.1




[PATCH v11 2/2] vfio-user: Message-based DMA support

2024-08-19 Thread Mattias Nissler
Wire up support for DMA for the case where the vfio-user client does not
provide mmap()-able file descriptors, but DMA requests must be performed
via the VFIO-user protocol. This installs an indirect memory region,
which already works for pci_dma_{read,write}, and pci_dma_map works
thanks to the existing DMA bounce buffering support.

Note that while simple scenarios work with this patch, there's a known
race condition in libvfio-user that will mess up the communication
channel. See https://github.com/nutanix/libvfio-user/issues/279 for
details as well as a proposed fix.

Reviewed-by: Jagannathan Raman 
Signed-off-by: Mattias Nissler 
---
 hw/remote/trace-events|   2 +
 hw/remote/vfio-user-obj.c | 100 --
 2 files changed, 87 insertions(+), 15 deletions(-)

diff --git a/hw/remote/trace-events b/hw/remote/trace-events
index 0d1b7d56a5..358a68fb34 100644
--- a/hw/remote/trace-events
+++ b/hw/remote/trace-events
@@ -9,6 +9,8 @@ vfu_cfg_read(uint32_t offset, uint32_t val) "vfu: cfg: 0x%x -> 
0x%x"
 vfu_cfg_write(uint32_t offset, uint32_t val) "vfu: cfg: 0x%x <- 0x%x"
 vfu_dma_register(uint64_t gpa, size_t len) "vfu: registering GPA 0x%"PRIx64", 
%zu bytes"
 vfu_dma_unregister(uint64_t gpa) "vfu: unregistering GPA 0x%"PRIx64""
+vfu_dma_read(uint64_t gpa, size_t len) "vfu: DMA read 0x%"PRIx64", %zu bytes"
+vfu_dma_write(uint64_t gpa, size_t len) "vfu: DMA write 0x%"PRIx64", %zu bytes"
 vfu_bar_register(int i, uint64_t addr, uint64_t size) "vfu: BAR %d: addr 
0x%"PRIx64" size 0x%"PRIx64""
 vfu_bar_rw_enter(const char *op, uint64_t addr) "vfu: %s request for BAR 
address 0x%"PRIx64""
 vfu_bar_rw_exit(const char *op, uint64_t addr) "vfu: Finished %s of BAR 
address 0x%"PRIx64""
diff --git a/hw/remote/vfio-user-obj.c b/hw/remote/vfio-user-obj.c
index 8dbafafb9e..0e93d7a7b4 100644
--- a/hw/remote/vfio-user-obj.c
+++ b/hw/remote/vfio-user-obj.c
@@ -300,6 +300,63 @@ static ssize_t vfu_object_cfg_access(vfu_ctx_t *vfu_ctx, 
char * const buf,
 return count;
 }
 
+static MemTxResult vfu_dma_read(void *opaque, hwaddr addr, uint64_t *val,
+unsigned size, MemTxAttrs attrs)
+{
+MemoryRegion *region = opaque;
+vfu_ctx_t *vfu_ctx = VFU_OBJECT(region->owner)->vfu_ctx;
+uint8_t buf[sizeof(uint64_t)];
+
+trace_vfu_dma_read(region->addr + addr, size);
+
+g_autofree dma_sg_t *sg = g_malloc0(dma_sg_size());
+vfu_dma_addr_t vfu_addr = (vfu_dma_addr_t)(region->addr + addr);
+if (vfu_addr_to_sgl(vfu_ctx, vfu_addr, size, sg, 1, PROT_READ) < 0 ||
+vfu_sgl_read(vfu_ctx, sg, 1, buf) != 0) {
+return MEMTX_ERROR;
+}
+
+*val = ldn_he_p(buf, size);
+
+return MEMTX_OK;
+}
+
+static MemTxResult vfu_dma_write(void *opaque, hwaddr addr, uint64_t val,
+ unsigned size, MemTxAttrs attrs)
+{
+MemoryRegion *region = opaque;
+vfu_ctx_t *vfu_ctx = VFU_OBJECT(region->owner)->vfu_ctx;
+uint8_t buf[sizeof(uint64_t)];
+
+trace_vfu_dma_write(region->addr + addr, size);
+
+stn_he_p(buf, size, val);
+
+g_autofree dma_sg_t *sg = g_malloc0(dma_sg_size());
+vfu_dma_addr_t vfu_addr = (vfu_dma_addr_t)(region->addr + addr);
+if (vfu_addr_to_sgl(vfu_ctx, vfu_addr, size, sg, 1, PROT_WRITE) < 0 ||
+vfu_sgl_write(vfu_ctx, sg, 1, buf) != 0) {
+return MEMTX_ERROR;
+}
+
+return MEMTX_OK;
+}
+
+static const MemoryRegionOps vfu_dma_ops = {
+.read_with_attrs = vfu_dma_read,
+.write_with_attrs = vfu_dma_write,
+.endianness = DEVICE_HOST_ENDIAN,
+.valid = {
+.min_access_size = 1,
+.max_access_size = 8,
+.unaligned = true,
+},
+.impl = {
+.min_access_size = 1,
+.max_access_size = 8,
+},
+};
+
 static void dma_register(vfu_ctx_t *vfu_ctx, vfu_dma_info_t *info)
 {
 VfuObject *o = vfu_get_private(vfu_ctx);
@@ -308,17 +365,30 @@ static void dma_register(vfu_ctx_t *vfu_ctx, 
vfu_dma_info_t *info)
 g_autofree char *name = NULL;
 struct iovec *iov = &info->iova;
 
-if (!info->vaddr) {
-return;
-}
-
 name = g_strdup_printf("mem-%s-%"PRIx64"", o->device,
-   (uint64_t)info->vaddr);
+   (uint64_t)iov->iov_base);
 
 subregion = g_new0(MemoryRegion, 1);
 
-memory_region_init_ram_ptr(subregion, NULL, name,
-   iov->iov_len, info->vaddr);
+if (info->vaddr) {
+memory_region_init_ram_ptr(subregion, OBJECT(o), name,
+   iov->iov_len, info->vaddr);
+} else {
+/*
+ * Note that I/O regions' MemoryRegionOps handle accesses of at most 8
+ * bytes at a time, and larger accesses are broken down. However,
+ * many/most DMA accesses are larger than 8 bytes and VFIO-user can
+ * handle large DMA accesses just fine, thus this size restriction
+ * unnecessarily hurts performance, in particular given that each
+ 

[PATCH v11 0/2] Support message-based DMA in vfio-user server

2024-08-19 Thread Mattias Nissler
This series adds basic support for message-based DMA in qemu's vfio-user
server. This is useful for cases where the client does not provide file
descriptors for accessing system memory via memory mappings. My motivating use
case is to hook up device models as PCIe endpoints to a hardware design. This
works by bridging the PCIe transaction layer to vfio-user, and the endpoint
does not access memory directly, but sends memory request TLPs to the hardware
design in order to perform DMA.

Note that more work is needed to make message-based DMA work well: qemu
currently breaks down DMA accesses into chunks of size 8 bytes at maximum, each
of which will be handled in a separate vfio-user DMA request message. This is
quite terrible for large DMA accesses, such as when nvme reads and writes
page-sized blocks for example. Thus, I would like to improve qemu to be able to
perform larger accesses, at least for indirect memory regions. I have something
working locally, but since this will likely result in more involved surgery and
discussion, I am leaving this to be addressed in a separate patch.
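
As a rough illustration of the cost described above (an editorial sketch, not code from this series): if every chunk of at most 8 bytes becomes its own vfio-user DMA request, the message count is a ceiling division of the access length by the chunk size. dma_request_count() is a hypothetical helper, and it assumes one message per chunk with no alignment-induced extra splits.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Number of request messages needed to transfer len bytes when each
 * message carries at most max_chunk bytes (ceiling division).
 */
static size_t dma_request_count(size_t len, size_t max_chunk)
{
    assert(max_chunk > 0);
    return len / max_chunk + (len % max_chunk != 0);
}
```

Under these assumptions a 4096-byte block costs dma_request_count(4096, 8) = 512 messages, versus a single message if the whole block could be sent at once.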

Changes from v1:

* Address Stefan's review comments. In particular, enforce an allocation limit
  and don't drop the map client callbacks given that map requests can fail when
  hitting size limits.

* libvfio-user version bump now included in the series.

* Tested as well on big-endian s390x. This uncovered another byte order issue
  in vfio-user server code that I've included a fix for.

Changes from v2:

* Add a preparatory patch to make bounce buffering an AddressSpace-specific
  concept.

* The total buffer size limit parameter is now per AddressSpace and can be
  configured for PCIDevice via a property.

* Store a magic value in first bytes of bounce buffer struct as a best effort
  measure to detect invalid pointers in address_space_unmap.

Changes from v3:

* libvfio-user now supports twin-socket mode which uses separate sockets for
  client->server and server->client commands, respectively. This addresses the
  concurrent command bug triggered by server->client DMA access commands. See
  https://github.com/nutanix/libvfio-user/issues/279 for details.

* Add missing teardown code in do_address_space_destroy.

* Fix bounce buffer size bookkeeping race condition.

* Generate unmap notification callbacks unconditionally.

* Some cosmetic fixes.

Changes from v4:

* Fix accidentally dropped memory_region_unref, control flow restored to match
  previous code to simplify review.

* Some cosmetic fixes.

Changes from v5:

* Unregister indirect memory region in libvfio-user dma_unregister callback.

Changes from v6:

* Rebase, resolve straightforward merge conflict in system/dma-helpers.c

Changes from v7:

* Rebase (applied cleanly)

* Restore various Reviewed-by and Tested-by tags that I failed to carry
  forward (I double-checked that the patches haven't changed since the reviewed
  version)

Changes from v8:

* Rebase (clean)

* Change bounce buffer size accounting to use uint32_t so it works also on
  hosts that don't support uint64_t atomics, such as mipsel. As a consequence
  overflows are a real concern now, so switch to a cmpxchg loop for allocating
  bounce buffer space.
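
A minimal sketch of what a cmpxchg loop for reserving bounce buffer space can look like, using C11 atomics rather than QEMU's qatomic helpers. This is not the code from the patch: bounce_reserve() and bounce_release() are hypothetical names, and this version rejects a request that does not fit instead of doing anything smarter. The limit check is written so that the 32-bit sum cannot overflow.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Try to reserve len bytes against a shared budget of max bytes.
 * Returns false, leaving *used unchanged, if the request does not fit.
 */
static bool bounce_reserve(_Atomic uint32_t *used, uint32_t max, uint32_t len)
{
    uint32_t cur = atomic_load(used);

    do {
        /* Compare against the remaining space so cur + len cannot wrap. */
        if (cur > max || len > max - cur) {
            return false;
        }
        /* On failure, cur is reloaded with the current value and we retry. */
    } while (!atomic_compare_exchange_weak(used, &cur, cur + len));

    return true;
}

/* Give back a previous reservation of len bytes. */
static void bounce_release(_Atomic uint32_t *used, uint32_t len)
{
    atomic_fetch_sub(used, len);
}
```

A plain atomic add followed by a limit check would be simpler, but with a 32-bit counter the add itself could wrap, which is why the check and the update are combined in one compare-and-exchange.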

Changes from v9:

* Incorporate patch split and QEMU_MUTEX_GUARD change by phi...@linaro.org

* Use size_t instead of uint32_t for bounce buffer size accounting. The qdev
  property remains uint32_t though, so it has a consistent size regardless of
  host.

Changes from v10:

* Update the commit to uprev the libvfio-user subproject to the latest
  libvfio-user revision.

* Break out the "softmmu: Support concurrent bounce buffers" patch so this
  series only touches vfio-user code and can be picked up as is by Jag.

Mattias Nissler (2):
  Update subprojects/libvfio-user
  vfio-user: Message-based DMA support

 hw/remote/trace-events|   2 +
 hw/remote/vfio-user-obj.c | 100 +-
 subprojects/libvfio-user.wrap |   2 +-
 3 files changed, 88 insertions(+), 16 deletions(-)

-- 
2.34.1




Re: [PATCH v3] hw/ppc: Implement -dtb support for PowerNV

2024-08-19 Thread Aditya Gupta

Hello Nick,

On 16/08/24 07:50, Nicholas Piggin wrote:

<...snip...>

One little nit is MachineState.fdt vs PnvMachineState.fdt
which is now confusing. I would call the new PnvMachineState member
something like fdt_from_dtb, or fdt_override?

I agree, this is confusing. Could machine->fdt be used instead?

Yeah that could be another option. Test pnv.dtb or add a new bool
to pnv if you need to check whether the fdt has been provided by
cmdline.


Sure, I will use machine->fdt. Testing pnv.dtb should be good enough to 
check if -dtb was passed I think.



Regarding the conversation about CAS, I don't know much about it beyond
the basics, but thanks to you and Cedric I got to learn a few
things.



Thanks,

Aditya Gupta


The other question... Some machines rebuild fdt at init, others at
reset time. As far as I understood, spapr has to rebuild on reset
because C-A-S call can update the fdt so you have to undo that on
reset.

C-A-S is a guest OS hcall. reset is called before the guest OS
is started.

Right, but when you reboot it needs to be reverted to initial
(pre-CAS) fdt.


Did powernv just copy that without really needing it, I wonder?
Maybe that could be rearranged to just do it at init time (e.g., see
hw/riscv/virt.c which is simpler).

The machine is aware of user created devices (on the command line)
only at reset time.

Ah, I should have followed a bit closer. riscv, arm use a
machine_done notifier for that (and x86, loongarch for ACPI / BIOS
tables). So that avoids fdt rebuild after the first reset I think.

Anyway I don't really mind then, following other archs would be okay,
but keeping similar with spapr and avoiding code change is also good.
Maybe add a small comment that we use reset, rather than the machine_done
notifier of other archs, to stay similar to spapr.

Thanks,
Nick




Re: [PATCH v8 13/13] acpi/ghes: check if the BIOS pointers for HEST are correct

2024-08-19 Thread Igor Mammedov
On Fri, 16 Aug 2024 09:37:45 +0200
Mauro Carvalho Chehab  wrote:

> The OS kernels navigate between HEST, error source struct
> and CPER by the usage of some pointers. Double-check if such
> pointers were properly initialized, ensuring that they match
> the right address for CPER.

As QEMU, we don't care what the guest wrote into those addresses
(i.e. it's not the hardware's business), even if QEMU later tramples
on the wrong guest memory (it's the guest's responsibility to do the init right).

However, this patch introduces a use for hest_addr_le, which I was looking for.
See notes below.

> 
> Signed-off-by: Mauro Carvalho Chehab 
> ---
>  hw/acpi/ghes.c | 30 +-
>  1 file changed, 29 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
> index a822a5eafaa0..51e2e40e5a9c 100644
> --- a/hw/acpi/ghes.c
> +++ b/hw/acpi/ghes.c
> @@ -85,6 +85,9 @@ enum AcpiHestSourceId {
>  #define HEST_GHES_V2_TABLE_SIZE  92
>  #define GHES_ACK_OFFSET  (64 + GAS_ADDR_OFFSET + 
> ACPI_HEST_HEADER_SIZE)
>  
> +/* ACPI 6.2: 18.3.2.7: Generic Hardware Error Source */
> +#define GHES_ERR_ST_ADDR_OFFSET  (20 + GAS_ADDR_OFFSET + 
> ACPI_HEST_HEADER_SIZE)
> +
>  /*
>   * Values for error_severity field
>   */
> @@ -425,7 +428,10 @@ NotifierList acpi_generic_error_notifiers =
>  void ghes_record_cper_errors(const void *cper, size_t len,
>   enum AcpiGhesNotifyType notify, Error **errp)
>  {
> -uint64_t cper_addr, read_ack_start_addr;
> +uint64_t hest_read_ack_start_addr, read_ack_start_addr;
> +uint64_t read_ack_start_addr_2, err_source_struct;
> +uint64_t hest_err_block_addr, error_block_addr;
> +uint64_t cper_addr, cper_addr_2;
>  enum AcpiHestSourceId source;
>  AcpiGedState *acpi_ged_state;
>  AcpiGhesState *ags;
> @@ -450,6 +456,28 @@ void ghes_record_cper_errors(const void *cper, size_t 
> len,
>  cper_addr += ACPI_HEST_SRC_ID_COUNT * sizeof(uint64_t);
>  cper_addr += source * ACPI_GHES_MAX_RAW_DATA_LENGTH;
>  
> +err_source_struct = le64_to_cpu(ags->hest_addr_le) +
> +source * HEST_GHES_V2_TABLE_SIZE;

There are no guarantees that the HEST table will contain only GHESv2 sources,
and once another kind of source is added this place becomes broken.

We need to iterate over HEST taking that into account
and find only the GHESv2 structure with the source id of interest.

This function (and acpi_ghes_record_errors() as well), taking source_id
as input, should be able to look up pointers from HEST in guest RAM.
A very crude idea could look something like this:

typedef struct hest_source_type2len {
    uint16_t type;
    int len;
} hest_structure_type2len;

hest_structure_type2len supported_hest_sources[] = {
    /* Table 18-344 Generic Hardware Error Source version 2 (GHESv2) Structure */
    {.type = 10, .len = 92},
};

uint64_t find_error_source(src_id) {
    uint32_t struct_offset = hest_header_size;
    uint16_t type, id;
    do {
        addr = ags->hest_addr_le + struct_offset;

        cpu_physical_memory_read(addr, &id);
        if (src_id == id)
            return addr;

        cpu_physical_memory_read(addr, &type);
        struct_offset += get_len_from_supported_hest_sources(type);
    } while (struct_offset < hest_len);
    assert if not found
}

uint64_t get_error_status_block_addr(src_id) {
    struct_addr = find_error_source(src_id);
    hest_err_block_addr = struct_addr + GHES_ERR_ST_ADDR_OFFSET;
    // read intermediate pointer to status block addr pointer in hw table
    cpu_physical_memory_read(hest_err_block_addr, &error_block_addr);
    // read actual pointer to status block
    cpu_physical_memory_read(error_block_addr, &error_status_block_addr);
    return error_status_block_addr;
}
 
ditto for read_ack modulo indirection that we have for error_status_block_addr

This way we can easily map source id to error status block
and find needed addresses using pointer info from guest RAM
without fragile pointer math and assumptions which might go wrong
when new error sources are added and regardless of the order they
are being added.
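
A self-contained sketch of the lookup idea above, operating on a plain byte buffer instead of guest RAM so it can be compiled on its own. It is not QEMU code. Assumptions, stated loudly: every error source structure starts with a 16-bit little-endian Type followed by a 16-bit Source Id; the type 9 / 64-byte entry for the original GHES structure is my reading of the ACPI layout and is not taken from this thread; find_ghesv2_source() and the helpers are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HEST_TYPE_GHES    9   /* assumption: GHES structure, 64 bytes */
#define HEST_TYPE_GHES_V2 10  /* GHESv2, 92 bytes, as in the email above */

typedef struct {
    uint16_t type;
    size_t len;
} HestTypeLen;

static const HestTypeLen supported_sources[] = {
    { .type = HEST_TYPE_GHES,    .len = 64 },
    { .type = HEST_TYPE_GHES_V2, .len = 92 },
};

static uint16_t read_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Length of a known structure type, or 0 if the type is not supported. */
static size_t source_len(uint16_t type)
{
    size_t n = sizeof(supported_sources) / sizeof(supported_sources[0]);

    for (size_t i = 0; i < n; i++) {
        if (supported_sources[i].type == type) {
            return supported_sources[i].len;
        }
    }
    return 0;
}

/*
 * Walk the error source structures in sources[0..total_len) and return
 * the byte offset of the GHESv2 structure whose Source Id is src_id,
 * or -1 if there is none, an unknown type is hit, or the data is truncated.
 */
static long find_ghesv2_source(const uint8_t *sources, size_t total_len,
                               uint16_t src_id)
{
    size_t off = 0;

    while (off + 4 <= total_len) {
        uint16_t type = read_le16(sources + off);
        uint16_t id = read_le16(sources + off + 2);
        size_t len = source_len(type);

        if (len == 0 || len > total_len - off) {
            return -1;
        }
        if (type == HEST_TYPE_GHES_V2 && id == src_id) {
            return (long)off;
        }
        off += len;
    }
    return -1;
}
```

The point of the per-type length table is that the walk stays correct when structures of different sizes are mixed and regardless of their order, which plain "source * HEST_GHES_V2_TABLE_SIZE" arithmetic cannot guarantee.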

> +/* Check if BIOS addr pointers were properly generated */
> +
> +hest_err_block_addr = err_source_struct + GHES_ERR_ST_ADDR_OFFSET;
> +hest_read_ack_start_addr = err_source_struct + GHES_ACK_OFFSET;
> +
> +cpu_physical_memory_read(hest_err_block_addr, &error_block_addr,
> + sizeof(error_block_addr));
> +
> +cpu_physical_memory_read(error_block_addr, &cper_addr_2,
> + sizeof(error_block_addr));
> +
> +cpu_physical_memory_read(hest_read_ack_start_addr, 
> &read_ack_start_addr_2,
> +  sizeof(read_ack_start_addr_2));
> +
> +assert(cper_addr == cper_addr_2);
> +assert(read_ack_start_addr == read_ack_start_addr_2);
> +
> +/* Update ACK offset to notify about a new error */
> +
>  cpu_physical_memory_read(read_ack_start_addr,
>   &read_ack, sizeof(uint64_t));
>  




Re: [PATCH v8 00/13] Add ACPI CPER firmware first error injection on ARM emulation

2024-08-19 Thread Igor Mammedov
On Fri, 16 Aug 2024 09:37:32 +0200
Mauro Carvalho Chehab  wrote:

> This series adds support for injecting generic CPER records.  Such records
> are generated outside QEMU via a provided script.
> 
> On this version, I added two optional patches at the end:
> - acpi/ghes: cleanup generic error data logic
> 
>   It drops some obvious comments from some already-existing code.
>   As we're already doing lots of changes at the code, it sounded
>   reasonable to me to have such cleanup here;
> 
> - acpi/ghes: check if the BIOS pointers for HEST are correct
> 
>   QEMU has two ways to navigate to a CPER start data: via its
>   memory address or indirectly following 2 BIOS pointers.
>   OS only have the latter one. This patch validates if the BIOS
>   links used by the OS were properly produced, comparing to the
>   actual location of the CPER record.

I went over the series.
Once the suggestion in 13/13 is implemented,
we can get rid of the pointer math that is reshuffled several times
in the patches here.

I'd suggest structuring the series as follows:
 
  1. patch that adds hest_addr_le
  2. refactoring the current code to use address lookup instead of pointer math
  3. renaming patches
  4. patch adding the new error source
  5. QAPI patch
  6. python script for error injection

With that in place, we would probably need to
  * iron out minor migration compat issues
    (I didn't look for them during this review round, as much
     will still change)
  * make sure that the bios tables test is updated

> 
> ---
> 
> v8:
> - Fix one of the BIOS links that were incorrect;
> - Changed mem error internal injection to use a common code;
> - No more hardcoded values for CPER: instead of using just the
>   payload at the QAPI, it now has the full raw CPER there;
> - Error injection script now supports changing fields at the
>   Generic Error Data section of the CPER;
> - Several minor cleanups.
> 
> v7:
> - Change the way offsets are calculated and used on HEST table.
>   Now, it is compatible with migrations as all offsets are relative
>   to the HEST table;
> - GHES interface is now more generic: the entire CPER is sent via
>   QMP, instead of just the payload;
> - Some code cleanups to make the code more robust;
> - The python script now uses QEMUMonitorProtocol class.
> 
> v6:
> - PNP0C33 device creation moved to aml-build.c;
> - acpi_ghes record functions now use ACPI notify parameter,
>   instead of source ID;
> - the number of source IDs is now automatically calculated;
> - some code cleanups and function/var renames;
> - some fixes and cleanups at the error injection script;
> - ghes cper stub now produces an error if cper JSON is not compiled;
> - Offset calculation logic for GHES was refactored;
> - Updated documentation to reflect the GHES allocated size;
> - Added a x-mpidr object for QOM usage;
> - Added a patch making usage of x-mpidr field at ARM injection
>   script;
> 
> v5:
> - CPER guid is now passing as string;
> - raw-data is now passed with base64 encode;
> - Removed several GPIO left-overs from arm/virt.c changes;
> - Lots of cleanups and improvements at the error injection script.
>   It now better handles QMP dialog and doesn't print debug messages.
>   Also, code was split on two modules, to make easier to add more
>   error injection commands.
> 
> v4:
> - CPER generation moved to happen outside QEMU;
> - One patch adding support for mpidr query was removed.
> 
> v3:
> - patch 1 cleanups with some comment changes and adding another place where
>   the poweroff GPIO define should be used. No changes on other patches (except
>   due to conflict resolution).
> 
> v2:
> - added a new patch using a define for GPIO power pin;
> - patch 2 changed to also use a define for generic error GPIO pin;
> - a couple cleanups at patch 2 removing unneeded else clauses.
> 
> Example of generating a CPER record:
> 
> $ scripts/ghes_inject.py -d arm -p 0xdeadbeef
> GUID: e19e3d16-bc11-11e4-9caa-c2051d5d46b0
> Generic Error Status Block (20 bytes):
>     01 00 00 00 00 00 00 00 00 00 00 00 90 00 00 00
>   0010  00 00 00 00
> 
> Generic Error Data Entry (72 bytes):
>     16 3d 9e e1 11 bc e4 11 9c aa c2 05 1d 5d 46 b0
>   0010  00 00 00 00 00 03 00 00 48 00 00 00 00 00 00 00
>   0020  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>   0030  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>   0040  00 00 00 00 00 00 00 00
> 
> Payload (72 bytes):
>     05 00 00 00 01 00 00 00 48 00 00 00 00 00 00 00
>   0010  00 00 00 80 00 00 00 00 10 05 0f 00 00 00 00 00
>   0020  00 00 00 00 00 00 00 00 00 20 14 00 02 01 00 03
>   0030  0f 00 91 00 00 00 00 00 ef be ad de 00 00 00 00
>   00

[PATCH] docs/system/cpu-hotplug: Update example's socket-id/core-id

2024-08-19 Thread Peter Maydell
At some point the way we allocate socket-id and core-id to CPUs
by default changed; update the example of how to do CPU hotplug
and unplug so the example commands work again. The differences
in the sample input and output are:
 * the second CPU is now socket-id=0 core-id=1,
   not socket-id=1 core-id=0
 * the order of fields from the qmp_shell is different (it seems
   to now always be in alphabetical order)

Signed-off-by: Peter Maydell 
---
I noticed this while I was playing around with vcpu hotplug trying to
demonstrate a memory leak I want to fix...

 docs/system/cpu-hotplug.rst | 54 ++---
 1 file changed, 26 insertions(+), 28 deletions(-)

diff --git a/docs/system/cpu-hotplug.rst b/docs/system/cpu-hotplug.rst
index 015ce2b6ec3..443ff226b90 100644
--- a/docs/system/cpu-hotplug.rst
+++ b/docs/system/cpu-hotplug.rst
@@ -33,23 +33,23 @@ vCPU hotplug
   {
   "return": [
   {
-  "type": "IvyBridge-IBRS-x86_64-cpu",
-  "vcpus-count": 1,
   "props": {
-  "socket-id": 1,
-  "core-id": 0,
+  "core-id": 1,
+  "socket-id": 0,
   "thread-id": 0
-  }
+  },
+  "type": "IvyBridge-IBRS-x86_64-cpu",
+  "vcpus-count": 1
   },
   {
+  "props": {
+  "core-id": 0,
+  "socket-id": 0,
+  "thread-id": 0
+  },
   "qom-path": "/machine/unattached/device[0]",
   "type": "IvyBridge-IBRS-x86_64-cpu",
-  "vcpus-count": 1,
-  "props": {
-  "socket-id": 0,
-  "core-id": 0,
-  "thread-id": 0
-  }
+  "vcpus-count": 1
   }
   ]
   }
@@ -58,18 +58,18 @@ vCPU hotplug
 (4) The ``query-hotpluggable-cpus`` command returns an object for CPUs
 that are present (containing a "qom-path" member) or which may be
 hot-plugged (no "qom-path" member).  From its output in step (3), we
-can see that ``IvyBridge-IBRS-x86_64-cpu`` is present in socket 0,
-while hot-plugging a CPU into socket 1 requires passing the listed
+can see that ``IvyBridge-IBRS-x86_64-cpu`` is present in socket 0 core 0,
+while hot-plugging a CPU into socket 0 core 1 requires passing the listed
 properties to QMP ``device_add``::
 
   (QEMU) device_add id=cpu-2 driver=IvyBridge-IBRS-x86_64-cpu socket-id=1 
core-id=0 thread-id=0
   {
   "execute": "device_add",
   "arguments": {
-  "socket-id": 1,
+  "core-id": 1,
   "driver": "IvyBridge-IBRS-x86_64-cpu",
   "id": "cpu-2",
-  "core-id": 0,
+  "socket-id": 0,
   "thread-id": 0
   }
   }
@@ -83,34 +83,32 @@ vCPU hotplug
 
   (QEMU) query-cpus-fast
   {
-  "execute": "query-cpus-fast",
   "arguments": {}
+  "execute": "query-cpus-fast",
   }
   {
   "return": [
   {
-  "qom-path": "/machine/unattached/device[0]",
-  "target": "x86_64",
-  "thread-id": 11534,
   "cpu-index": 0,
   "props": {
-  "socket-id": 0,
   "core-id": 0,
+  "socket-id": 0,
   "thread-id": 0
   },
-  "arch": "x86"
+  "qom-path": "/machine/unattached/device[0]",
+  "target": "x86_64",
+  "thread-id": 28957
   },
   {
-  "qom-path": "/machine/peripheral/cpu-2",
-  "target": "x86_64",
-  "thread-id": 12106,
   "cpu-index": 1,
   "props": {
-  "socket-id": 1,
-  "core-id": 0,
+  "core-id": 1,
+  "socket-id": 0,
   "thread-id": 0
   },
-  "arch": "x86"
+  "qom-path": "/machine/peripheral/cpu-2",
+  "target": "x86_64",
+  "thread-id": 29095
   }
   ]
   }
@@ -123,10 +121,10 @@ From the 'qmp-shell', invoke the QMP ``device_del`` 
command::
 
   (QEMU) device_del id=cpu-2
   {
-  "execute": "device_del",
   "arguments": {
   "id": "cpu-2"
   }
+  "execute": "device_del",
   }
   {
   "return": {}
-- 
2.34.1




[PATCH] crypto/tlscredspsk: Free username on finalize

2024-08-19 Thread Peter Maydell
When the creds->username property is set we allocate memory
for it in qcrypto_tls_creds_psk_prop_set_username(), but
we never free this when the QCryptoTLSCredsPSK is destroyed.
Free the memory in finalize.

This fixes a LeakSanitizer complaint in migration-test:

$ (cd build/asan; ASAN_OPTIONS="fast_unwind_on_malloc=0" 
QTEST_QEMU_BINARY=./qemu-system-x86_64 ./tests/qtest/migration-test --tap -k -p 
/x86_64/migration/precopy/unix/tls/psk)

=
==3867512==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 5 byte(s) in 1 object(s) allocated from:
#0 0x5624e5c99dee in malloc 
(/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/qemu-system-x86_64+0x218edee)
 (BuildId: a9e623fa1009a9435c0142c037cd7b8c1ad04ce3)
#1 0x7fb199ae9738 in g_malloc debian/build/deb/../../../glib/gmem.c:128:13
#2 0x7fb199afe583 in g_strdup 
debian/build/deb/../../../glib/gstrfuncs.c:361:17
#3 0x5624e82ea919 in qcrypto_tls_creds_psk_prop_set_username 
/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../crypto/tlscredspsk.c:255:23
#4 0x5624e812c6b5 in property_set_str 
/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../qom/object.c:2277:5
#5 0x5624e8125ce5 in object_property_set 
/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../qom/object.c:1463:5
#6 0x5624e8136e7c in object_set_properties_from_qdict 
/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../qom/object_interfaces.c:55:14
#7 0x5624e81372d2 in user_creatable_add_type 
/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../qom/object_interfaces.c:112:5
#8 0x5624e8137964 in user_creatable_add_qapi 
/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../qom/object_interfaces.c:157:11
#9 0x5624e891ba3c in qmp_object_add 
/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../qom/qom-qmp-cmds.c:227:5
#10 0x5624e8af9118 in qmp_marshal_object_add 
/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/qapi/qapi-commands-qom.c:337:5
#11 0x5624e8bd1d49 in do_qmp_dispatch_bh 
/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../qapi/qmp-dispatch.c:128:5
#12 0x5624e8cb2531 in aio_bh_call 
/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../util/async.c:171:5
#13 0x5624e8cb340c in aio_bh_poll 
/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../util/async.c:218:13
#14 0x5624e8c0be98 in aio_dispatch 
/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../util/aio-posix.c:423:5
#15 0x5624e8cba3ce in aio_ctx_dispatch 
/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../util/async.c:360:5
#16 0x7fb199ae0d3a in g_main_dispatch 
debian/build/deb/../../../glib/gmain.c:3419:28
#17 0x7fb199ae0d3a in g_main_context_dispatch 
debian/build/deb/../../../glib/gmain.c:4137:7
#18 0x5624e8cbe1d9 in glib_pollfds_poll 
/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../util/main-loop.c:287:9
#19 0x5624e8cbcb13 in os_host_main_loop_wait 
/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../util/main-loop.c:310:5
#20 0x5624e8cbc6dc in main_loop_wait 
/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../util/main-loop.c:589:11
#21 0x5624e6f3f917 in qemu_main_loop 
/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../system/runstate.c:801:9
#22 0x5624e893379c in qemu_default_main 
/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../system/main.c:37:14
#23 0x5624e89337e7 in main 
/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../system/main.c:48:12
#24 0x7fb197972d8f in __libc_start_call_main 
csu/../sysdeps/nptl/libc_start_call_main.h:58:16
#25 0x7fb197972e3f in __libc_start_main csu/../csu/libc-start.c:392:3
#26 0x5624e5c16fa4 in _start 
(/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/qemu-system-x86_64+0x210bfa4)
 (BuildId: a9e623fa1009a9435c0142c037cd7b8c1ad04ce3)

SUMMARY: AddressSanitizer: 5 byte(s) leaked in 1 allocation(s).

Cc: qemu-sta...@nongnu.org
Signed-off-by: Peter Maydell 
---
Found this playing around with the address-sanitizer and running
"make check".  I guess this is stable material but maybe not
important enough to go into 9.1 at this point in the cycle, since the
bug has been present since the code was first written in 2018.

 crypto/tlscredspsk.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/crypto/tlscredspsk.c b/crypto/tlscredspsk.c
index 546cad1c5a4..0d6b71a37cf 100644
--- a/crypto/tlscredspsk.c
+++ b/crypto/tlscredspsk.c
@@ -243,6 +243,7 @@ qcrypto_tls_creds_psk_finalize(Object *obj)
 QCryptoTLSCredsPSK *creds = QCRYPTO_TLS_CREDS_PSK(obj);
 
 qcrypto_tls_creds_psk_unload(creds);
+g_free(creds->username);
 }
 
 static void
-- 
2.34.1




[PATCH] MAINTAINERS: Remove myself as reviewer

2024-08-19 Thread Beraldo Leal
Finally taking this off my to-do list. It’s been a privilege to be part
of this project, but I am no longer actively involved in reviewing
Python code here, so I believe it's best to update the list to reflect
the current maintainers.

Please, feel free to reach out if any questions arise.

Signed-off-by: Beraldo Leal 
---
 MAINTAINERS | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 3584d6a6c6..806cf0884d 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -3174,7 +3174,6 @@ F: qapi/cryptodev.json
 Python library
 M: John Snow 
 M: Cleber Rosa 
-R: Beraldo Leal 
 S: Maintained
 F: python/
 T: git https://gitlab.com/jsnow/qemu.git python
@@ -4121,7 +4120,6 @@ M: Alex Bennée 
 M: Philippe Mathieu-Daudé 
 M: Thomas Huth 
 R: Wainer dos Santos Moschetta 
-R: Beraldo Leal 
 S: Maintained
 F: .github/workflows/lockdown.yml
 F: .gitlab-ci.yml
@@ -4162,7 +4160,6 @@ W: https://trello.com/b/6Qi1pxVn/avocado-qemu
 R: Cleber Rosa 
 R: Philippe Mathieu-Daudé 
 R: Wainer dos Santos Moschetta 
-R: Beraldo Leal 
 S: Odd Fixes
 F: tests/avocado/
 
-- 
2.40.0




Re: [PULL v2 42/61] physmem: Add helper function to destroy CPU AddressSpace

2024-08-19 Thread Peter Maydell
On Tue, 23 Jul 2024 at 11:59, Michael S. Tsirkin  wrote:
>
> From: Salil Mehta 
>
> Virtual CPU Hot-unplug leads to unrealization of a CPU object. This also
> involves destruction of the CPU AddressSpace. Add common function to help
> destroy the CPU AddressSpace.

Based on some testing I've been doing that tries to use
(a variation of) this function to do the cleanup of the
CPU address spaces, I think there's a problem with it.
(This doesn't matter for 9.1 because nothing calls this
function as yet.)

> +void cpu_address_space_destroy(CPUState *cpu, int asidx)
> +{
> +CPUAddressSpace *cpuas;
> +
> +assert(cpu->cpu_ases);
> +assert(asidx >= 0 && asidx < cpu->num_ases);
> +/* KVM cannot currently support multiple address spaces. */
> +assert(asidx == 0 || !kvm_enabled());
> +
> +cpuas = &cpu->cpu_ases[asidx];
> +if (tcg_enabled()) {
> +memory_listener_unregister(&cpuas->tcg_as_listener);
> +}
> +
> +address_space_destroy(cpuas->as);
> +g_free_rcu(cpuas->as, rcu);

RCU doesn't guarantee the order in which it executes the
rcu reclaim hooks, so we can run the g_free() of cpuas->as
*before* the do_address_space_destroy hook that
address_space_destroy() sets up. This means we free the
RCU node that the latter hook is using, and then
do_address_space_destroy is never called (and I think also
I was seeing the RCU callback thread get stalled entirely,
because the list node it wanted to traverse was garbage.)

However, I don't understand how to fix this -- how is a
caller of address_space_destroy() supposed to know when it
can free the memory containing the AddressSpace ?
Paolo: do you understand how this should work? We seem
to already use address_space_destroy() in various places
usually for an AS that's embedded in a device struct --
how do we ensure that the destroy has finished before we
free the device memory ?
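
For anyone following along, the ordering hazard described above can be
shown with a toy model in plain C. This is only a sketch and not QEMU's RCU
implementation: the toy_* names are made up, the two callbacks loosely stand
in for the deferred destroy hook and the deferred free, and a "freed" flag is
set instead of really freeing memory so that the bad access can be counted
rather than triggering undefined behaviour.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_as {
    bool destroyed;   /* set by the deferred destroy hook */
    bool freed;       /* set instead of really calling free() */
};

typedef void (*toy_cb)(struct toy_as *as, int *bad_accesses);

/* Stands in for the deferred destroy hook. */
static void toy_do_destroy(struct toy_as *as, int *bad_accesses)
{
    if (as->freed) {
        /* In real code this would be a use-after-free. */
        (*bad_accesses)++;
        return;
    }
    as->destroyed = true;
}

/* Stands in for the deferred free of the same object. */
static void toy_do_free(struct toy_as *as, int *bad_accesses)
{
    (void)bad_accesses;
    as->freed = true;
}

/*
 * Queue both hooks, then run them either in queue order or reversed.
 * Returns the number of accesses made to an already "freed" object.
 */
static int toy_run(bool reversed)
{
    struct toy_as as = { false, false };
    toy_cb queue[2] = { toy_do_destroy, toy_do_free };
    int bad = 0;

    for (size_t i = 0; i < 2; i++) {
        size_t idx = reversed ? 1 - i : i;
        queue[idx](&as, &bad);
    }
    return bad;
}
```

With the hooks run in queue order nothing goes wrong; with the order
reversed, the destroy hook touches an object that has already been released.
The sketch does not suggest a fix; it only shows why an ordering guarantee
(or some other way of knowing the destroy has finished) is needed.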

> +
> +if (asidx == 0) {
> +/* reset the convenience alias for address space 0 */
> +cpu->as = NULL;
> +}
> +
> +if (--cpu->cpu_ases_count == 0) {
> +g_free(cpu->cpu_ases);
> +cpu->cpu_ases = NULL;
> +}
> +}

thanks
-- PMM



Re: [PATCH v4 4/6] machine/nitro-enclave: Add built-in Nitro Secure Module device

2024-08-19 Thread Dorjoy Chowdhury
Hey Alex,

On Mon, Aug 19, 2024 at 4:13 PM Alexander Graf  wrote:
>
> Hey Dorjoy,
>
> On 18.08.24 13:42, Dorjoy Chowdhury wrote:
> > AWS Nitro Enclaves have built-in Nitro Secure Module (NSM) device which
> > is used for stripped down TPM functionality like attestation. This commit
> > adds the built-in NSM device in the nitro-enclave machine type.
> >
> > In Nitro Enclaves, all the PCRs start in a known zero state and the first
> > 16 PCRs are locked from boot and reserved. The PCR0, PCR1, PCR2 and PCR8
> > contain the SHA384 hashes related to the EIF file used to boot the
> > VM for validation.
> >
> > Some optional nitro-enclave machine options have been added:
> >  - 'id': Enclave identifier, reflected in the module-id of the NSM
> > device. If not provided, a default id will be set.
> >  - 'parent-role': Parent instance IAM role ARN, reflected in PCR3
> > of the NSM device.
> >  - 'parent-id': Parent instance identifier, reflected in PCR4 of the
> > NSM device.
> >
> > Signed-off-by: Dorjoy Chowdhury 
> > ---
> >   crypto/meson.build  |   2 +-
> >   crypto/x509-utils.c |  73 +++
>
>
> Can you please put this new API into its own patch file?
>
>
> >   hw/core/eif.c   | 225 +---
> >   hw/core/eif.h   |   5 +-
>
>
> These changes to eif.c should ideally already be part of the patch that
> introduces eif.c (patch 1), no? In fact, do you think you can make the
> whole eif logic its own patch file?
>

Good point. I guess it should be possible if I have the virtio-nsm
device commit first and then add the machine/nitro-enclave commit with
full support with the devices. That will of course make the
machine/nitro-enclave commit larger. What do you think?

Regards,
Dorjoy



Re: [PATCH] crypto/tlscredspsk: Free username on finalize

2024-08-19 Thread Daniel P . Berrangé
On Mon, Aug 19, 2024 at 03:50:21PM +0100, Peter Maydell wrote:
> When the creds->username property is set we allocate memory
> for it in qcrypto_tls_creds_psk_prop_set_username(), but
> we never free this when the QCryptoTLSCredsPSK is destroyed.
> Free the memory in finalize.
> 
> This fixes a LeakSanitizer complaint in migration-test:
> 
> $ (cd build/asan; ASAN_OPTIONS="fast_unwind_on_malloc=0" 
> QTEST_QEMU_BINARY=./qemu-system-x86_64 ./tests/qtest/migration-test --tap -k 
> -p /x86_64/migration/precopy/unix/tls/psk)
> 
> =
> ==3867512==ERROR: LeakSanitizer: detected memory leaks
> 
> Direct leak of 5 byte(s) in 1 object(s) allocated from:
> #0 0x5624e5c99dee in malloc 
> (/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/qemu-system-x86_64+0x218edee)
>  (BuildId: a9e623fa1009a9435c0142c037cd7b8c1ad04ce3)
> #1 0x7fb199ae9738 in g_malloc debian/build/deb/../../../glib/gmem.c:128:13
> #2 0x7fb199afe583 in g_strdup 
> debian/build/deb/../../../glib/gstrfuncs.c:361:17
> #3 0x5624e82ea919 in qcrypto_tls_creds_psk_prop_set_username 
> /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../crypto/tlscredspsk.c:255:23
> #4 0x5624e812c6b5 in property_set_str 
> /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../qom/object.c:2277:5
> #5 0x5624e8125ce5 in object_property_set 
> /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../qom/object.c:1463:5
> #6 0x5624e8136e7c in object_set_properties_from_qdict 
> /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../qom/object_interfaces.c:55:14
> #7 0x5624e81372d2 in user_creatable_add_type 
> /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../qom/object_interfaces.c:112:5
> #8 0x5624e8137964 in user_creatable_add_qapi 
> /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../qom/object_interfaces.c:157:11
> #9 0x5624e891ba3c in qmp_object_add 
> /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../qom/qom-qmp-cmds.c:227:5
> #10 0x5624e8af9118 in qmp_marshal_object_add 
> /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/qapi/qapi-commands-qom.c:337:5
> #11 0x5624e8bd1d49 in do_qmp_dispatch_bh 
> /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../qapi/qmp-dispatch.c:128:5
> #12 0x5624e8cb2531 in aio_bh_call 
> /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../util/async.c:171:5
> #13 0x5624e8cb340c in aio_bh_poll 
> /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../util/async.c:218:13
> #14 0x5624e8c0be98 in aio_dispatch 
> /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../util/aio-posix.c:423:5
> #15 0x5624e8cba3ce in aio_ctx_dispatch 
> /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../util/async.c:360:5
> #16 0x7fb199ae0d3a in g_main_dispatch 
> debian/build/deb/../../../glib/gmain.c:3419:28
> #17 0x7fb199ae0d3a in g_main_context_dispatch 
> debian/build/deb/../../../glib/gmain.c:4137:7
> #18 0x5624e8cbe1d9 in glib_pollfds_poll 
> /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../util/main-loop.c:287:9
> #19 0x5624e8cbcb13 in os_host_main_loop_wait 
> /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../util/main-loop.c:310:5
> #20 0x5624e8cbc6dc in main_loop_wait 
> /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../util/main-loop.c:589:11
> #21 0x5624e6f3f917 in qemu_main_loop 
> /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../system/runstate.c:801:9
> #22 0x5624e893379c in qemu_default_main 
> /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../system/main.c:37:14
> #23 0x5624e89337e7 in main 
> /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../system/main.c:48:12
> #24 0x7fb197972d8f in __libc_start_call_main 
> csu/../sysdeps/nptl/libc_start_call_main.h:58:16
> #25 0x7fb197972e3f in __libc_start_main csu/../csu/libc-start.c:392:3
> #26 0x5624e5c16fa4 in _start 
> (/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/qemu-system-x86_64+0x210bfa4)
>  (BuildId: a9e623fa1009a9435c0142c037cd7b8c1ad04ce3)
> 
> SUMMARY: AddressSanitizer: 5 byte(s) leaked in 1 allocation(s).
> 
> Cc: qemu-sta...@nongnu.org
> Signed-off-by: Peter Maydell 
> ---
> Found this playing around with the address-sanitizer and running
> "make check".  I guess this is stable material but maybe not
> important enough to go into 9.1 at this point in the cycle, since the
> bug has been present since the code was first written in 2018.

The memory leak is low impact, since credentials either live for the
entire QEMU lifetime, or are sometimes created & deleted on
the fly for infrequent operations like live migration.
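
For illustration, the ownership pattern the patch restores can be sketched
in plain C as below. This is not the QEMU code: toy_creds,
toy_creds_set_username() and toy_creds_finalize() are made-up names, and
malloc()/free() stand in for g_strdup()/g_free(). Whether the real setter
drops a previously set value is not something this sketch claims to show.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct toy_creds {
    char *username;   /* private copy, owned by the object */
};

/*
 * Setter: store a private copy of value, releasing any earlier copy.
 * Returns 0 on success, -1 if the allocation fails.
 */
static int toy_creds_set_username(struct toy_creds *creds, const char *value)
{
    size_t len = strlen(value) + 1;
    char *copy = malloc(len);

    if (!copy) {
        return -1;
    }
    memcpy(copy, value, len);
    free(creds->username);      /* free(NULL) is a no-op */
    creds->username = copy;
    return 0;
}

/* Finalizer: everything the setter allocated must be released here. */
static void toy_creds_finalize(struct toy_creds *creds)
{
    free(creds->username);
    creds->username = NULL;
}

/* Returns 1 if the copy was stored and then released, 0 or -1 otherwise. */
static int toy_creds_demo(void)
{
    struct toy_creds creds = { NULL };
    int ok;

    if (toy_creds_set_username(&creds, "qemu") != 0) {
        return -1;
    }
    ok = strcmp(creds.username, "qemu") == 0;
    toy_creds_finalize(&creds);
    return ok && creds.username == NULL;
}
```

Without the free() in the finalizer, each created and destroyed object
leaks one small string, which matches the kind of report LeakSanitizer
gives above.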

> 
>  crypto/tlscredspsk.c | 1 +
>  1 file changed, 1 insertion(+)

Reviewed-by: Daniel P. Berrangé 

With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-h

Re: [PATCH v4 4/6] machine/nitro-enclave: Add built-in Nitro Secure Module device

2024-08-19 Thread Dorjoy Chowdhury
On Mon, Aug 19, 2024 at 4:13 PM Alexander Graf  wrote:
>
> Hey Dorjoy,
>
> On 18.08.24 13:42, Dorjoy Chowdhury wrote:
> > AWS Nitro Enclaves have built-in Nitro Secure Module (NSM) device which
> > is used for stripped down TPM functionality like attestation. This commit
> > adds the built-in NSM device in the nitro-enclave machine type.
> >
> > In Nitro Enclaves, all the PCRs start in a known zero state and the first
> > 16 PCRs are locked from boot and reserved. The PCR0, PCR1, PCR2 and PCR8
> > contain the SHA384 hashes related to the EIF file used to boot the
> > VM for validation.
> >
> > Some optional nitro-enclave machine options have been added:
> >  - 'id': Enclave identifier, reflected in the module-id of the NSM
> > device. If not provided, a default id will be set.
> >  - 'parent-role': Parent instance IAM role ARN, reflected in PCR3
> > of the NSM device.
> >  - 'parent-id': Parent instance identifier, reflected in PCR4 of the
> > NSM device.
> >
> > Signed-off-by: Dorjoy Chowdhury 
> > ---
> >   crypto/meson.build  |   2 +-
> >   crypto/x509-utils.c |  73 +++
>
>
> Can you please put this new API into its own patch file?
>
>
> >   hw/core/eif.c   | 225 +---
> >   hw/core/eif.h   |   5 +-
>
>
> These changes to eif.c should ideally already be part of the patch that
> introduces eif.c (patch 1), no? In fact, do you think you can make the
> whole eif logic its own patch file?
>
>
> >   hw/core/meson.build |   4 +-
> >   hw/i386/Kconfig |   1 +
> >   hw/i386/nitro_enclave.c | 141 +++-
> >   include/crypto/x509-utils.h |  22 
> >   include/hw/i386/nitro_enclave.h |  26 
> >   9 files changed, 479 insertions(+), 20 deletions(-)
> >   create mode 100644 crypto/x509-utils.c
> >   create mode 100644 include/crypto/x509-utils.h
> >
> > diff --git a/crypto/meson.build b/crypto/meson.build
> > index c46f9c22a7..09633194ed 100644
> > --- a/crypto/meson.build
> > +++ b/crypto/meson.build
> > @@ -62,7 +62,7 @@ endif
> >   if gcrypt.found()
> > util_ss.add(gcrypt, files('random-gcrypt.c'))
> >   elif gnutls.found()
> > -  util_ss.add(gnutls, files('random-gnutls.c'))
> > +  util_ss.add(gnutls, files('random-gnutls.c', 'x509-utils.c'))
>
>
> What if we don't have gnutls. Will everything still compile or do we
> need to add any dependencies?
>
>

[...]

> >
> > diff --git a/hw/core/meson.build b/hw/core/meson.build
> > index f32d1ad943..8dc4552e35 100644
> > --- a/hw/core/meson.build
> > +++ b/hw/core/meson.build
> > @@ -12,6 +12,8 @@ hwcore_ss.add(files(
> > 'qdev-clock.c',
> >   ))
> >
> > +libcbor = dependency('libcbor', version: '>=0.7.0')
> > +
> >   common_ss.add(files('cpu-common.c'))
> >   common_ss.add(files('machine-smp.c'))
> >   system_ss.add(when: 'CONFIG_FITLOADER', if_true: files('loader-fit.c'))
> > @@ -24,7 +26,7 @@ system_ss.add(when: 'CONFIG_REGISTER', if_true: 
> > files('register.c'))
> >   system_ss.add(when: 'CONFIG_SPLIT_IRQ', if_true: files('split-irq.c'))
> >   system_ss.add(when: 'CONFIG_XILINX_AXI', if_true: files('stream.c'))
> >   system_ss.add(when: 'CONFIG_PLATFORM_BUS', if_true: files('sysbus-fdt.c'))
> > -system_ss.add(when: 'CONFIG_NITRO_ENCLAVE', if_true: [files('eif.c'), 
> > zlib])
> > +system_ss.add(when: 'CONFIG_NITRO_ENCLAVE', if_true: [files('eif.c'), 
> > zlib, libcbor, gnutls])
>
>
> Ah, you add the gnutls dependency here. Great! However, this means we
> now make gnutls (and libcbor) a mandatory dependency for the default
> configuration. Does configure know about that? I believe before gnutls
> was optional, right?
>

I see gnutls is not a required dependency in the root meson.build. I
am not sure what we should do here.

Hey Daniel, do you have any suggestions about how this dependency
should be included?

Regards,
Dorjoy



Re: [PATCH v2] i386/cpu: Introduce enable_cpuid_0x1f to force exposing CPUID 0x1f

2024-08-19 Thread Igor Mammedov
On Wed, 14 Aug 2024 00:39:57 +0800
Xiaoyao Li  wrote:

> On 8/13/2024 10:51 PM, Xiaoyao Li wrote:
> > On 8/13/2024 5:27 PM, Igor Mammedov wrote:  
> >> On Mon, 12 Aug 2024 23:31:45 -0400
> >> Xiaoyao Li  wrote:
> >>  
> >>> Currently, QEMU exposes CPUID 0x1f to guest only when necessary, i.e.,
> >>> when topology level that cannot be enumerated by leaf 0xB, e.g., die or
> >>> module level, are configured for the guest, e.g., -smp xx,dies=2.
> >>>
> >>> However, 1) TDX architecture forces to require CPUID 0x1f to
> >>> configure CPU topology. and 2) There is a bug in Windows that
> >>> Windows 10/11 expects valid 0x1f leafs when the maximum basic
> >>> leaf > 0x1f[1].
> >>   1. will it boot if you use older cpu model?  
> > 
> > It can boot with any cpu model that has .level < 0x1f.  
> 
> I realize just now that we don't need to introduce "x-cpuid-1f" as the 
> workaround for buggy windows. We can always workaround it by limiting 
> the maximum basic CPUID leaf to less than 0x1f, i.e., -cpu xxx,level=0x1e

I'd suggest adding this to the changelog, in a 'known issues' section
(I mean the Windows bug symptoms and the suggested workaround).

> 
> I think we can ignore this patch for now. I will re-submit it with TDX 
> enabling series, and with "x-cpuid-1f" interface removed.
> 
> >>   2. how user would know that this option would be needed?  
> > 
> > Honestly, I don't have an answer for it.
> > 
> > I'm not sure if it is the duty of QEMU to identify this case and print 
> > some hint to user. It's the bug of Windows, maybe Microsoft should put 
> > something in their known bugs list for users?
I guess you've answered your own question already: we have a workaround,
and there is no need for yet another option that users won't know how to use.

As for configuring the workaround, it's up to the upper layers, which know
what OS would be running in the VM.




Re: [RFC] Virtualizing tagged disaggregated memory capacity (app specific, multi host shared)

2024-08-19 Thread Jonathan Cameron
On Sun, 18 Aug 2024 21:12:34 -0500
John Groves  wrote:

> On 24/08/15 05:22PM, Jonathan Cameron wrote:
> > Introduction
> > 
> > 
> > If we think application specific memory (including inter-host shared
> > memory) is a thing, it will also be a thing people want to use with
> > virtual machines, potentially nested. So how do we present it at the
> > Host to VM boundary?
> > 
> > This RFC is perhaps premature given we haven't yet merged upstream
> > support for the bare metal case. However I'd like to get the discussion
> > going given we've touched briefly on this in a number of CXL sync calls
> > and it is clear no one is
> 
> Excellent write-up, thanks Jonathan.
> 
> Hannes' idea of an in-person discussion at LPC is a great idea - count me in.

Had a feeling you might say that ;)

> 
> As the proprietor of famfs [1] I have many thoughts.
> 
> First, I like the concept of application-specific memory (ASM), but I wonder
> if there might be a better term for it. ASM suggests that there is one
> application, but I'd suggest that a more concise statement of the concept
> is that the Linux kernel never accesses or mutates the memory - even though
> multiple apps might share it (e.g. via famfs). It's a subtle point, but
> an important one for RAS etc. ASM might better be called non-kernel-managed
> memory - though that name does not have as good a ring to it. Will mull this
> over further...

Naming is always the hard bit :)  I agree that one doesn't work for
shared capacity. You can tell I didn't start there :)

> 
> Now a few level-setting comments on CXL and Dynamic Capacity Devices (DCDs),
> some of which will be obvious to many of you:
> 
> * A DCD is just a memory device with an allocator and host-level
>   access-control built in.
> * Usable memory from a DCD is not available until the fabric manager (likely
>   on behalf of an orchestrator) performs an Initiate Dynamic Capacity Add
>   command to the DCD.
> * A DCD allocation has a tag (uuid) which is the invariant way of identifying
>   the memory from that allocation.
> * The tag becomes known to the host from the DCD extents provided via
>   a CXL event following successful allocation.
> * The memory associated with a tagged allocation will surface as a dax device
>   on each host that has access to it. But of course dax device naming &
>   numbering won't be consistent across separate hosts - so we need to use
>   the uuid's to find specific memory.
> 
> A few less foundational observations:
> 
> * It does not make sense to "online" shared or sharable memory as system-ram,
>   because system-ram gets zeroed, which blows up use cases for sharable
>   memory. So the default for sharable memory must be devdax mode.
(CXL specific diversion)

Absolutely agree with this. There is a 'corner' that irritates me in the
spec, though, which is that there is no distinction between shareable and
shared capacity. If we are in a constrained setup with limited HPA or DPA
space, we may not want to have separate DCD regions for these.  Thus it is
plausible that an orchestrator might tell a memory appliance to present
memory for general use and yet it surfaces as shareable.  So there may need
to be an opt-in path, at least for going ahead and using this memory as
normal RAM.

> * Tags are mandatory for sharable allocations, and allowed but optional for
>   non-sharable allocations. The implication is that non-sharable allocations
>   may get onlined automatically as system-ram, so we don't need a namespace
>   for those. (I argued for mandatory tags on all allocations - hey you don't
>   have to use them - but encountered objections and dropped it.)
> * CXL access control only goes to host root ports; CXL has no concept of
>   giving access to a VM. So some component on a host (perhaps logically
>   an orchestrator component) needs to plumb memory to VMs as appropriate.

Yes.  It's some mashup of an orchestrator and VMM / libvirt, local library
of your choice. We can just group it into the ill-defined concept of
a distributed orchestrator.

> 
> So tags are a namespace to find specific memory "allocations" (which in the
> CXL consortium, we usually refer to as "tagged capacity").
> 
> In an orchestrated environment, the orchestrator would allocate resources
> (including tagged memory capacity), make that capacity visible on the right
> host(s), and then provide the tag when starting the app if needed.
> 
> If (e.g.) the memory contains a famfs file system, famfs needs the uuid of the
> root memory allocation to find the right memory device. Once mounted, it's a
> file system so apps can be directed to the mount path. Apps that consume the
> dax devices directly also need the uuid because /dev/dax0.0 is not invariant
> across a cluster...
> 
> I have been assuming that when the CXL stack discovers a new DCD allocation,
> it will configure the devdax device and provide some way to find it by tag.
> /sys/cxl//dev or whatever. That works as far as

Re: [RFC-PATCH v2] vhost-user: add a request-reply lock

2024-08-19 Thread Michael S. Tsirkin
On Mon, Aug 19, 2024 at 05:32:48PM +0530, Prasad Pandit wrote:
> From: Prasad Pandit 
> 
> QEMU threads use vhost_user_write/read calls to send
> and receive request/reply messages from a vhost-user
> device. When multiple threads communicate with the
> same vhost-user device, they can receive each other's
> messages, resulting in an erroneous state.
> 
> When fault_thread exits upon completion of Postcopy
> migration, it sends a 'postcopy_end' message to the
> vhost-user device. But sometimes 'postcopy_end' message
> is sent while vhost device is being setup via
> vhost_dev_start().
> 
>  Thread-1                           Thread-2
>
>  vhost_dev_start                    postcopy_ram_incoming_cleanup
>  vhost_device_iotlb_miss            postcopy_notify
>  vhost_backend_update_device_iotlb  vhost_user_postcopy_notifier
>  vhost_user_send_device_iotlb_msg   vhost_user_postcopy_end
>  process_message_reply              process_message_reply
>  vhost_user_read                    vhost_user_read
>  vhost_user_read_header             vhost_user_read_header
>  "Fail to update device iotlb"      "Failed to receive reply to postcopy_end"
> 
> This creates confusion when vhost-user device receives
> 'postcopy_end' message while it is trying to update
> IOTLB entries.
> 
>  vhost_user_read_header:
>   700871,700871: Failed to read msg header. Flags 0x0 instead of 0x5.
>  vhost_device_iotlb_miss:
>   700871,700871: Fail to update device iotlb
>  vhost_user_postcopy_end:
>   700871,700900: Failed to receive reply to postcopy_end
>  vhost_user_read_header:
>   700871,700871: Failed to read msg header. Flags 0x0 instead of 0x5.
> 
> Here fault thread seems to end the postcopy migration
> while another thread is starting the vhost-user device.
> 
> Add a mutex lock to hold for one request-reply cycle
> and avoid such race condition.
> 
> Fixes: 46343570c06e ("vhost+postcopy: Wire up POSTCOPY_END notify")
> Suggested-by: Peter Xu 
> Signed-off-by: Prasad Pandit 

makes sense.
Acked-by: Michael S. Tsirkin 
But do not post v2 as reply to v1 pls.
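
For readers following the thread, the request-reply cycle that the quoted
commit message wants to protect can be sketched as below. This is only an
illustration under stated assumptions: Channel, channel_write(),
channel_read() and send_and_wait() are hypothetical stand-ins and not
QEMU code, and a plain pthread mutex is used where the patch uses
QEMU_LOCK_GUARD. The point is that the lock covers both the write and the
matching read, so a second thread cannot consume the first thread's reply.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a vhost-user channel with one reply slot. */
typedef struct Channel {
    pthread_mutex_t request_reply_lock; /* held for one request-reply cycle */
    int pending_reply;                  /* reply queued by the toy "device" */
} Channel;

static void channel_init(Channel *ch)
{
    pthread_mutex_init(&ch->request_reply_lock, NULL);
    ch->pending_reply = 0;
}

/* Toy device: the reply simply echoes the request id. */
static void channel_write(Channel *ch, int request_id)
{
    ch->pending_reply = request_id;
}

static int channel_read(Channel *ch)
{
    return ch->pending_reply;
}

/* Returns 0 when the reply matches the request, -1 otherwise. */
static int send_and_wait(Channel *ch, int request_id)
{
    int reply;

    assert(ch != NULL);
    /* Take the lock before the write and drop it only after the read. */
    pthread_mutex_lock(&ch->request_reply_lock);
    channel_write(ch, request_id);
    reply = channel_read(ch);
    pthread_mutex_unlock(&ch->request_reply_lock);

    return reply == request_id ? 0 : -1;
}

typedef struct Worker {
    Channel *ch;
    int request_id;
    int failures;
} Worker;

static void *worker_main(void *opaque)
{
    Worker *w = opaque;

    for (int i = 0; i < 100000; i++) {
        if (send_and_wait(w->ch, w->request_id) < 0) {
            w->failures++;
        }
    }
    return NULL;
}

/* Runs two threads against one channel; returns the mismatched replies. */
static int run_two_threads(void)
{
    Channel ch;
    Worker a = { &ch, 1, 0 };
    Worker b = { &ch, 2, 0 };
    pthread_t ta, tb;

    channel_init(&ch);
    pthread_create(&ta, NULL, worker_main, &a);
    pthread_create(&tb, NULL, worker_main, &b);
    pthread_join(ta, NULL);
    pthread_join(tb, NULL);
    pthread_mutex_destroy(&ch.request_reply_lock);

    return a.failures + b.failures;
}
```

With the lock held across the whole cycle, run_two_threads() should report
no mismatched replies; dropping the lock between the write and the read
would reintroduce the interleaving shown in the Thread-1/Thread-2 table.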

> ---
>  hw/virtio/vhost-user.c | 74 ++
>  include/hw/virtio/vhost-user.h |  3 ++
>  2 files changed, 77 insertions(+)
> 
> v2:
>  - Place QEMU_LOCK_GUARD near the vhost_user_write() calls, holding
>the lock for longer fails some tests during rpmbuild(8).
>  - rpmbuild(8) fails for some SRPMs, not all. RHEL-9 SRPM builds with
>this patch, whereas Fedora SRPM does not build.
>  - The host OS also seems to affect rpmbuild(8). Some SRPMs build well
>on RHEL-9, but not on Fedora-40 machine.
> 
> v1: 
> https://lore.kernel.org/qemu-devel/20240808095147.291626-3-ppan...@redhat.com/#R
> 
> diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
> index 00561daa06..7b030ae2cd 100644
> --- a/hw/virtio/vhost-user.c
> +++ b/hw/virtio/vhost-user.c
> @@ -24,6 +24,7 @@
>  #include "qemu/main-loop.h"
>  #include "qemu/uuid.h"
>  #include "qemu/sockets.h"
> +#include "qemu/lockable.h"
>  #include "sysemu/runstate.h"
>  #include "sysemu/cryptodev.h"
>  #include "migration/postcopy-ram.h"
> @@ -446,6 +447,10 @@ static int vhost_user_set_log_base(struct vhost_dev *dev, uint64_t base,
>  .hdr.size = sizeof(msg.payload.log),
>  };
>  
> +struct vhost_user *u = dev->opaque;
> +struct VhostUserState *us = u->user;
> +QEMU_LOCK_GUARD(&us->vhost_user_request_reply_lock);
> +
>  /* Send only once with first queue pair */
>  if (dev->vq_index != 0) {
>  return 0;
> @@ -664,6 +669,7 @@ static int send_remove_regions(struct vhost_dev *dev,
> bool reply_supported)
>  {
>  struct vhost_user *u = dev->opaque;
> +struct VhostUserState *us = u->user;
>  struct vhost_memory_region *shadow_reg;
>  int i, fd, shadow_reg_idx, ret;
>  ram_addr_t offset;
> @@ -685,6 +691,8 @@ static int send_remove_regions(struct vhost_dev *dev,
>  vhost_user_fill_msg_region(&region_buffer, shadow_reg, 0);
>  msg->payload.mem_reg.region = region_buffer;
>  
> +QEMU_LOCK_GUARD(&us->vhost_user_request_reply_lock);
> +
>  ret = vhost_user_write(dev, msg, NULL, 0);
>  if (ret < 0) {
>  return ret;
> @@ -718,6 +726,7 @@ static int send_add_regions(struct vhost_dev *dev,
>  bool reply_supported, bool track_ramblocks)
>  {
>  struct vhost_user *u = dev->opaque;
> +struct VhostUserState *us = u->user;
>  int i, fd, ret, reg_idx, reg_fd_idx;
>  struct vhost_memory_region *reg;
>  MemoryRegion *mr;
> @@ -746,6 +755,8 @@ static int send_add_regions(struct vhost_dev *dev,
>  vhost_user_fill_msg_region(&region_buffer, reg, offset);
>  msg->payload.mem_reg.region = region_buffer;
>  
> +QEMU_LOCK_GUARD(&us->vhost_user_request_reply_lock);
> +
>  ret = vhost_user_write(dev, msg, &fd, 1);
>  if (ret < 0) {
>

Re: [RFC-PATCH v2] vhost-user: add a request-reply lock

2024-08-19 Thread Michael S. Tsirkin
On Mon, Aug 19, 2024 at 11:42:02AM -0400, Michael S. Tsirkin wrote:
> On Mon, Aug 19, 2024 at 05:32:48PM +0530, Prasad Pandit wrote:
> > From: Prasad Pandit 
> > 
> > QEMU threads use vhost_user_write/read calls to send
> > and receive request/reply messages from a vhost-user
> > device. When multiple threads communicate with the
> > same vhost-user device, they can receive each other's
> > messages, resulting in an erroneous state.
> > 
> > When fault_thread exits upon completion of Postcopy
> > migration, it sends a 'postcopy_end' message to the
> > vhost-user device. But sometimes 'postcopy_end' message
> > is sent while vhost device is being setup via
> > vhost_dev_start().
> > 
> >  Thread-1                           Thread-2
> >
> >  vhost_dev_start                    postcopy_ram_incoming_cleanup
> >  vhost_device_iotlb_miss            postcopy_notify
> >  vhost_backend_update_device_iotlb  vhost_user_postcopy_notifier
> >  vhost_user_send_device_iotlb_msg   vhost_user_postcopy_end
> >  process_message_reply              process_message_reply
> >  vhost_user_read                    vhost_user_read
> >  vhost_user_read_header             vhost_user_read_header
> >  "Fail to update device iotlb"      "Failed to receive reply to postcopy_end"
> > 
> > This creates confusion when vhost-user device receives
> > 'postcopy_end' message while it is trying to update
> > IOTLB entries.
> > 
> >  vhost_user_read_header:
> >   700871,700871: Failed to read msg header. Flags 0x0 instead of 0x5.
> >  vhost_device_iotlb_miss:
> >   700871,700871: Fail to update device iotlb
> >  vhost_user_postcopy_end:
> >   700871,700900: Failed to receive reply to postcopy_end
> >  vhost_user_read_header:
> >   700871,700871: Failed to read msg header. Flags 0x0 instead of 0x5.
> > 
> > Here fault thread seems to end the postcopy migration
> > while another thread is starting the vhost-user device.
> > 
> > Add a mutex lock to hold for one request-reply cycle
> > and avoid such race condition.
> > 
> > Fixes: 46343570c06e ("vhost+postcopy: Wire up POSTCOPY_END notify")
> > Suggested-by: Peter Xu 
> > Signed-off-by: Prasad Pandit 
> 
> makes sense.
> Acked-by: Michael S. Tsirkin 
> But do not post v2 as reply to v1 pls.


Also, looks like this will replace Message-Id: 
<20240801124540.38774-1-xiangwench...@dayudpu.com>
correct?

> > ---
> >  hw/virtio/vhost-user.c | 74 ++
> >  include/hw/virtio/vhost-user.h |  3 ++
> >  2 files changed, 77 insertions(+)
> > 
> > v2:
> >  - Place QEMU_LOCK_GUARD near the vhost_user_write() calls, holding
> >the lock for longer fails some tests during rpmbuild(8).
> >  - rpmbuild(8) fails for some SRPMs, not all. RHEL-9 SRPM builds with
> >this patch, whereas Fedora SRPM does not build.
> >  - The host OS also seems to affect rpmbuild(8). Some SRPMs build well
> >on RHEL-9, but not on Fedora-40 machine.
> > 
> > v1: 
> > https://lore.kernel.org/qemu-devel/20240808095147.291626-3-ppan...@redhat.com/#R
> > 
> > diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
> > index 00561daa06..7b030ae2cd 100644
> > --- a/hw/virtio/vhost-user.c
> > +++ b/hw/virtio/vhost-user.c
> > @@ -24,6 +24,7 @@
> >  #include "qemu/main-loop.h"
> >  #include "qemu/uuid.h"
> >  #include "qemu/sockets.h"
> > +#include "qemu/lockable.h"
> >  #include "sysemu/runstate.h"
> >  #include "sysemu/cryptodev.h"
> >  #include "migration/postcopy-ram.h"
> > @@ -446,6 +447,10 @@ static int vhost_user_set_log_base(struct vhost_dev *dev, uint64_t base,
> >  .hdr.size = sizeof(msg.payload.log),
> >  };
> >  
> > +struct vhost_user *u = dev->opaque;
> > +struct VhostUserState *us = u->user;
> > +QEMU_LOCK_GUARD(&us->vhost_user_request_reply_lock);
> > +
> >  /* Send only once with first queue pair */
> >  if (dev->vq_index != 0) {
> >  return 0;
> > @@ -664,6 +669,7 @@ static int send_remove_regions(struct vhost_dev *dev,
> > bool reply_supported)
> >  {
> >  struct vhost_user *u = dev->opaque;
> > +struct VhostUserState *us = u->user;
> >  struct vhost_memory_region *shadow_reg;
> >  int i, fd, shadow_reg_idx, ret;
> >  ram_addr_t offset;
> > @@ -685,6 +691,8 @@ static int send_remove_regions(struct vhost_dev *dev,
> >  vhost_user_fill_msg_region(&region_buffer, shadow_reg, 0);
> >  msg->payload.mem_reg.region = region_buffer;
> >  
> > +QEMU_LOCK_GUARD(&us->vhost_user_request_reply_lock);
> > +
> >  ret = vhost_user_write(dev, msg, NULL, 0);
> >  if (ret < 0) {
> >  return ret;
> > @@ -718,6 +726,7 @@ static int send_add_regions(struct vhost_dev *dev,
> >  bool reply_supported, bool track_ramblocks)
> >  {
> >  struct vhost_user *u = dev->opaque;
> > +struct VhostUserState *us = u->user;
> >  int i, fd, ret, reg_idx, reg_fd_idx;
> >  struct vhost_memory_regio

Re: [PATCH v4 4/6] machine/nitro-enclave: Add built-in Nitro Secure Module device

2024-08-19 Thread Daniel P . Berrangé
On Mon, Aug 19, 2024 at 09:32:55PM +0600, Dorjoy Chowdhury wrote:
> On Mon, Aug 19, 2024 at 4:13 PM Alexander Graf  wrote:
> >
> > Hey Dorjoy,
> >
> > On 18.08.24 13:42, Dorjoy Chowdhury wrote:
> > > AWS Nitro Enclaves have built-in Nitro Secure Module (NSM) device which
> > > is used for stripped down TPM functionality like attestation. This commit
> > > adds the built-in NSM device in the nitro-enclave machine type.
> > >
> > > In Nitro Enclaves, all the PCRs start in a known zero state and the first
> > > 16 PCRs are locked from boot and reserved. The PCR0, PCR1, PCR2 and PCR8
> > > contain the SHA384 hashes related to the EIF file used to boot the
> > > VM for validation.
> > >
> > > Some optional nitro-enclave machine options have been added:
> > >  - 'id': Enclave identifier, reflected in the module-id of the NSM
> > > device. If not provided, a default id will be set.
> > >  - 'parent-role': Parent instance IAM role ARN, reflected in PCR3
> > > of the NSM device.
> > >  - 'parent-id': Parent instance identifier, reflected in PCR4 of the
> > > NSM device.
> > >
> > > Signed-off-by: Dorjoy Chowdhury 
> > > ---
> > >   crypto/meson.build  |   2 +-
> > >   crypto/x509-utils.c |  73 +++
> >
> >
> > Can you please put this new API into its own patch file?
> >
> >
> > >   hw/core/eif.c   | 225 +---
> > >   hw/core/eif.h   |   5 +-
> >
> >
> > These changes to eif.c should ideally already be part of the patch that
> > introduces eif.c (patch 1), no? In fact, do you think you can make the
> > whole eif logic its own patch file?
> >
> >
> > >   hw/core/meson.build |   4 +-
> > >   hw/i386/Kconfig |   1 +
> > >   hw/i386/nitro_enclave.c | 141 +++-
> > >   include/crypto/x509-utils.h |  22 
> > >   include/hw/i386/nitro_enclave.h |  26 
> > >   9 files changed, 479 insertions(+), 20 deletions(-)
> > >   create mode 100644 crypto/x509-utils.c
> > >   create mode 100644 include/crypto/x509-utils.h
> > >
> > > diff --git a/crypto/meson.build b/crypto/meson.build
> > > index c46f9c22a7..09633194ed 100644
> > > --- a/crypto/meson.build
> > > +++ b/crypto/meson.build
> > > @@ -62,7 +62,7 @@ endif
> > >   if gcrypt.found()
> > > util_ss.add(gcrypt, files('random-gcrypt.c'))
> > >   elif gnutls.found()
> > > -  util_ss.add(gnutls, files('random-gnutls.c'))
> > > +  util_ss.add(gnutls, files('random-gnutls.c', 'x509-utils.c'))
> >
> >
> > What if we don't have gnutls. Will everything still compile or do we
> > need to add any dependencies?
> >
> >
> 
> [...]
> 
> > >
> > > diff --git a/hw/core/meson.build b/hw/core/meson.build
> > > index f32d1ad943..8dc4552e35 100644
> > > --- a/hw/core/meson.build
> > > +++ b/hw/core/meson.build
> > > @@ -12,6 +12,8 @@ hwcore_ss.add(files(
> > > 'qdev-clock.c',
> > >   ))
> > >
> > > +libcbor = dependency('libcbor', version: '>=0.7.0')
> > > +
> > >   common_ss.add(files('cpu-common.c'))
> > >   common_ss.add(files('machine-smp.c'))
> > >   system_ss.add(when: 'CONFIG_FITLOADER', if_true: files('loader-fit.c'))
> > > @@ -24,7 +26,7 @@ system_ss.add(when: 'CONFIG_REGISTER', if_true: 
> > > files('register.c'))
> > >   system_ss.add(when: 'CONFIG_SPLIT_IRQ', if_true: files('split-irq.c'))
> > >   system_ss.add(when: 'CONFIG_XILINX_AXI', if_true: files('stream.c'))
> > >   system_ss.add(when: 'CONFIG_PLATFORM_BUS', if_true: 
> > > files('sysbus-fdt.c'))
> > > -system_ss.add(when: 'CONFIG_NITRO_ENCLAVE', if_true: [files('eif.c'), 
> > > zlib])
> > > +system_ss.add(when: 'CONFIG_NITRO_ENCLAVE', if_true: [files('eif.c'), 
> > > zlib, libcbor, gnutls])
> >
> >
> > Ah, you add the gnutls dependency here. Great! However, this means we
> > now make gnutls (and libcbor) a mandatory dependency for the default
> > configuration. Does configure know about that? I believe before gnutls
> > was optional, right?
> >
> 
> I see gnutls is not a required dependency in the root meson.build. I
> am not sure what we should do here.
> 
> Hey Daniel, do you have any suggestions about how this dependency
> should be included?

Unconditionally build the crypto/x509-utils.c file, but in that file
put #ifdef CONFIG_GNUTLS, and in the #else put a stub impl of the
method that just calls error_setg().

That way you can compile everything without any hard dep on gnutls,
but if someone tries to use it they'll get a runtime error when
gnutls is not built.
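
A minimal sketch of that layout, under loud assumptions: SketchError and
sketch_error_setg() are simplified stand-ins for QEMU's Error and
error_setg() so the fragment compiles on its own, and
sketch_x509_fingerprint() is a hypothetical name, not the function from
the patch. Only CONFIG_GNUTLS and the stub-in-#else structure come from
the suggestion above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, simplified stand-in for QEMU's Error object. */
typedef struct SketchError {
    char msg[128];
} SketchError;

/* Hypothetical stand-in for error_setg(): stores a message if errp is set. */
static void sketch_error_setg(SketchError *errp, const char *msg)
{
    assert(msg != NULL);
    if (errp) {
        snprintf(errp->msg, sizeof(errp->msg), "%s", msg);
    }
}

#ifdef CONFIG_GNUTLS

/*
 * The gnutls-backed implementation would live here. Only the prototype
 * is shown because the real body is outside the scope of this sketch.
 */
int sketch_x509_fingerprint(const unsigned char *cert, size_t size,
                            unsigned char *result, size_t *resultlen,
                            SketchError *errp);

#else /* !CONFIG_GNUTLS */

/* Stub: always compiled in, but fails at run time with a clear error. */
int sketch_x509_fingerprint(const unsigned char *cert, size_t size,
                            unsigned char *result, size_t *resultlen,
                            SketchError *errp)
{
    (void)cert;
    (void)size;
    (void)result;
    (void)resultlen;
    sketch_error_setg(errp,
                      "GNUTLS is required to fingerprint X.509 certificates");
    return -1;
}

#endif /* CONFIG_GNUTLS */
```

Callers then link against the same symbol in both configurations and only
need to handle the -1 return, so the build system needs no hard gnutls
dependency for this file.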


With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|




Re: [PATCH v4 4/6] machine/nitro-enclave: Add built-in Nitro Secure Module device

2024-08-19 Thread Alexander Graf


On 19.08.24 17:28, Dorjoy Chowdhury wrote:
> Hey Alex,
>
> On Mon, Aug 19, 2024 at 4:13 PM Alexander Graf  wrote:
> > Hey Dorjoy,
> >
> > On 18.08.24 13:42, Dorjoy Chowdhury wrote:
> > > AWS Nitro Enclaves have built-in Nitro Secure Module (NSM) device which
> > > is used for stripped down TPM functionality like attestation. This commit
> > > adds the built-in NSM device in the nitro-enclave machine type.
> > >
> > > In Nitro Enclaves, all the PCRs start in a known zero state and the first
> > > 16 PCRs are locked from boot and reserved. The PCR0, PCR1, PCR2 and PCR8
> > > contain the SHA384 hashes related to the EIF file used to boot the
> > > VM for validation.
> > >
> > > Some optional nitro-enclave machine options have been added:
> > >  - 'id': Enclave identifier, reflected in the module-id of the NSM
> > > device. If not provided, a default id will be set.
> > >  - 'parent-role': Parent instance IAM role ARN, reflected in PCR3
> > > of the NSM device.
> > >  - 'parent-id': Parent instance identifier, reflected in PCR4 of the
> > > NSM device.
> > >
> > > Signed-off-by: Dorjoy Chowdhury 
> > > ---
> > >   crypto/meson.build  |   2 +-
> > >   crypto/x509-utils.c |  73 +++
> >
> > Can you please put this new API into its own patch file?
> >
> > >   hw/core/eif.c   | 225 +---
> > >   hw/core/eif.h   |   5 +-
> >
> > These changes to eif.c should ideally already be part of the patch that
> > introduces eif.c (patch 1), no? In fact, do you think you can make the
> > whole eif logic its own patch file?
>
> Good point. I guess it should be possible if I have the virtio-nsm
> device commit first and then add the machine/nitro-enclave commit with
> full support with the devices. That will of course make the
> machine/nitro-enclave commit larger. What do you think?



As long as nothing compiles the code, it can rely on not yet implemented 
functions. So it's perfectly legit to add all your code in individual 
commits and then at the end add the meson.build change that implements 
the config option. How about the order below?


* Crypto patch for SHA384
* Crypto patch for x509 fingerprint
* NSM device emulation (including libcbor check, introduces CONFIG_VIRTIO_NSM)
* EIF format parsing (not compiled yet)
* Nitro Enclaves machine (introduces CONFIG_NITRO_ENCLAVE)
* Nitro Enclaves docs


Alex




Amazon Web Services Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 257764 B
Sitz: Berlin
Ust-ID: DE 365 538 597


Re: [PATCH v4 4/6] machine/nitro-enclave: Add built-in Nitro Secure Module device

2024-08-19 Thread Dorjoy Chowdhury
On Mon, Aug 19, 2024 at 9:53 PM Daniel P. Berrangé  wrote:
>
> On Mon, Aug 19, 2024 at 09:32:55PM +0600, Dorjoy Chowdhury wrote:
> > On Mon, Aug 19, 2024 at 4:13 PM Alexander Graf  wrote:
> > >
> > > Hey Dorjoy,
> > >
> > > On 18.08.24 13:42, Dorjoy Chowdhury wrote:
> > > > AWS Nitro Enclaves have built-in Nitro Secure Module (NSM) device which
> > > > is used for stripped down TPM functionality like attestation. This 
> > > > commit
> > > > adds the built-in NSM device in the nitro-enclave machine type.
> > > >
> > > > In Nitro Enclaves, all the PCRs start in a known zero state and the 
> > > > first
> > > > 16 PCRs are locked from boot and reserved. The PCR0, PCR1, PCR2 and PCR8
> > > > contain the SHA384 hashes related to the EIF file used to boot the
> > > > VM for validation.
> > > >
> > > > Some optional nitro-enclave machine options have been added:
> > > >  - 'id': Enclave identifier, reflected in the module-id of the NSM
> > > > device. If not provided, a default id will be set.
> > > >  - 'parent-role': Parent instance IAM role ARN, reflected in PCR3
> > > > of the NSM device.
> > > >  - 'parent-id': Parent instance identifier, reflected in PCR4 of the
> > > > NSM device.
> > > >
> > > > Signed-off-by: Dorjoy Chowdhury 
> > > > ---
> > > >   crypto/meson.build  |   2 +-
> > > >   crypto/x509-utils.c |  73 +++
> > >
> > >
> > > Can you please put this new API into its own patch file?
> > >
> > >
> > > >   hw/core/eif.c   | 225 +---
> > > >   hw/core/eif.h   |   5 +-
> > >
> > >
> > > These changes to eif.c should ideally already be part of the patch that
> > > introduces eif.c (patch 1), no? In fact, do you think you can make the
> > > whole eif logic its own patch file?
> > >
> > >
> > > >   hw/core/meson.build |   4 +-
> > > >   hw/i386/Kconfig |   1 +
> > > >   hw/i386/nitro_enclave.c | 141 +++-
> > > >   include/crypto/x509-utils.h |  22 
> > > >   include/hw/i386/nitro_enclave.h |  26 
> > > >   9 files changed, 479 insertions(+), 20 deletions(-)
> > > >   create mode 100644 crypto/x509-utils.c
> > > >   create mode 100644 include/crypto/x509-utils.h
> > > >
> > > > diff --git a/crypto/meson.build b/crypto/meson.build
> > > > index c46f9c22a7..09633194ed 100644
> > > > --- a/crypto/meson.build
> > > > +++ b/crypto/meson.build
> > > > @@ -62,7 +62,7 @@ endif
> > > >   if gcrypt.found()
> > > > util_ss.add(gcrypt, files('random-gcrypt.c'))
> > > >   elif gnutls.found()
> > > > -  util_ss.add(gnutls, files('random-gnutls.c'))
> > > > +  util_ss.add(gnutls, files('random-gnutls.c', 'x509-utils.c'))
> > >
> > >
> > > What if we don't have gnutls. Will everything still compile or do we
> > > need to add any dependencies?
> > >
> > >
> >
> > [...]
> >
> > > >
> > > > diff --git a/hw/core/meson.build b/hw/core/meson.build
> > > > index f32d1ad943..8dc4552e35 100644
> > > > --- a/hw/core/meson.build
> > > > +++ b/hw/core/meson.build
> > > > @@ -12,6 +12,8 @@ hwcore_ss.add(files(
> > > > 'qdev-clock.c',
> > > >   ))
> > > >
> > > > +libcbor = dependency('libcbor', version: '>=0.7.0')
> > > > +
> > > >   common_ss.add(files('cpu-common.c'))
> > > >   common_ss.add(files('machine-smp.c'))
> > > >   system_ss.add(when: 'CONFIG_FITLOADER', if_true: 
> > > > files('loader-fit.c'))
> > > > @@ -24,7 +26,7 @@ system_ss.add(when: 'CONFIG_REGISTER', if_true: 
> > > > files('register.c'))
> > > >   system_ss.add(when: 'CONFIG_SPLIT_IRQ', if_true: files('split-irq.c'))
> > > >   system_ss.add(when: 'CONFIG_XILINX_AXI', if_true: files('stream.c'))
> > > >   system_ss.add(when: 'CONFIG_PLATFORM_BUS', if_true: 
> > > > files('sysbus-fdt.c'))
> > > > -system_ss.add(when: 'CONFIG_NITRO_ENCLAVE', if_true: [files('eif.c'), 
> > > > zlib])
> > > > +system_ss.add(when: 'CONFIG_NITRO_ENCLAVE', if_true: [files('eif.c'), 
> > > > zlib, libcbor, gnutls])
> > >
> > >
> > > Ah, you add the gnutls dependency here. Great! However, this means we
> > > now make gnutls (and libcbor) a mandatory dependency for the default
> > > configuration. Does configure know about that? I believe before gnutls
> > > was optional, right?
> > >
> >
> > I see gnutls is not a required dependency in the root meson.build. I
> > am not sure what we should do here.
> >
> > Hey Daniel, do you have any suggestions about how this dependency
> > should be included?
>
> Unconditionally build the crypto/x509-utils.c file, but in that put
> file #ifdef CONFIG_GNUTLS, and in the #else put a stub impl of the
> method that just calls error_setg().
>
> That way you can compile everything without any hard dep on gnutls,
> but if someone tries to use it they'll get a runtime error when
> gnutls is not built
>

Understood. Thanks! What should I do about libcbor? That one is
required for building nitro-enclave and virtio-nsm. Should I make that
required in the root meson.build file?

Regards,
Dorj

[PATCH 00/11 v2] RISC-V: support CLIC v0.9 specification

2024-08-19 Thread Ian Brockbank
[Resubmission now the merge is correct]

This patch set gives an implementation of "RISC-V Core-Local Interrupt
Controller (CLIC) Version 0.9-draft-20210217". The specification comes
from [1], where you can find it as a PDF or as source.

This is based on the implementation from 2021 by Liu Zhiwei [3], who took
over the job from Michael Clark, who gave the first implementation of
clic-v0.7 specification [2]. I believe this implementation addresses all
the comments in Liu Zhiwei's RFC patch thread.

This implementation follows the CLIC 0.9-stable draft at 14 March 2024,
with the following exceptions and implementation details:
 - the CLIC control registers are memory-mapped as per earlier drafts (in
   particular version 0.9-draft, 20 June 2023)
 - the indirect CSR control in 0.9-stable is not implemented
 - the vector table can be either handler addresses (as per the spec)
   or a jump table where each entry is processed as an instruction,
   selectable with version number v0.9-jmp
 - each hart is assigned its own CLIC block
 - if PRV_S and/or PRV_M are supported, they are currently assumed to follow
   the PRV_M registers; a subsequent update will address this
 - support for PRV_S and PRV_M is selectable at CLIC instantiation
 - PRV_S and PRV_U registers are currently separate from PRV_M; a subsequent
   update will turn them into filtered views onto the PRV_M registers
 - support for PRV_S and PRV_M is selectable at CLIC instantiation by
   passing in a base address for the given modes; a base address of 0 is
   treated as not supported
 - PRV_S and PRV_U registers are mapped onto the PRV_M controls with
   appropriate filtering for the access mode
 - the RISCV virt machine has been updated to allow CLIC emulation by
   passing "machine=virt,clic=on" on the command line; various other
   parameters have been added to allow finer control of the CLIC behavior

The implementation (in jump-table mode) has been verified to match the
Cirrus Logic silicon (PRV_M only), which is based upon the Pulp
implementation [4] as of June 2023.

The implementation also includes a selection of qtests designed to verify
operation in all possible combinations of PRV_M, PRV_S and PRV_U.

[1] specification website: https://github.com/riscv/riscv-fast-interrupt.
[2] Michael Clark's original work:
https://github.com/sifive/riscv-qemu/tree/sifive-clic.
[3] RFC Patch submission by Liu Zhiwei:
https://lists.gnu.org/archive/html/qemu-devel/2021-04/msg01417.html
[4] Pulp implementation of CLIC: https://github.com/pulp-platform/clic

Ian Brockbank (11):
target/riscv: Add CLIC CSR mintstatus
target/riscv: Update CSR xintthresh in CLIC mode
hw/intc: Add CLIC device
target/riscv: Update CSR xie in CLIC mode
target/riscv: Update CSR xip in CLIC mode
target/riscv: Update CSR xtvec in CLIC mode
target/riscv: Update CSR xnxti in CLIC mode
target/riscv: Update interrupt handling in CLIC mode
target/riscv: Update interrupt return in CLIC mode
hw/riscv: add CLIC into virt machine
tests: add riscv clic qtest case and a function in qtest

This message and any attachments may contain privileged and confidential 
information that is intended solely for the person(s) to whom it is addressed. 
If you are not an intended recipient you must not: read; copy; distribute; 
discuss; take any action in or make any reliance upon the contents of this 
message; nor open or read any attachment. If you have received this message in 
error, please notify us as soon as possible on the following telephone number 
and destroy this message including any attachments. Thank you. Cirrus Logic 
International (UK) Ltd and Cirrus Logic International Semiconductor Ltd are 
companies registered in Scotland, with registered numbers SC089839 and SC495735 
respectively. Our registered office is at 7B Nightingale Way, Quartermile, 
Edinburgh, EH3 9EG, UK. Tel: +44 (0)131 272 7000. www.cirrus.com



[PATCH 01/11 v2] target/riscv: Add CLIC CSR mintstatus

2024-08-19 Thread Ian Brockbank
From: Ian Brockbank 

CSR mintstatus holds the active interrupt level for each supported
privilege mode. The supervisor CSR sintstatus and the user CSR uintstatus
provide restricted views of mintstatus.
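
As an illustration only (not part of the patch), the restricted views can
be sketched in standalone C. The field positions mil[31:24], sil[15:8]
and uil[7:0] are taken from the comments on the masks in this patch; the
helper names, the mintstatus_pack() function and the uintstatus view
(which this patch does not implement) are assumptions made for the sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Field layout as documented in the patch: mil[31:24], sil[15:8], uil[7:0]. */
#define MINTSTATUS_MIL 0xff000000u
#define MINTSTATUS_SIL 0x0000ff00u
#define MINTSTATUS_UIL 0x000000ffu

/* Hypothetical helper: build a mintstatus value from three 8-bit levels. */
static uint32_t mintstatus_pack(unsigned mil, unsigned sil, unsigned uil)
{
    assert(mil <= 0xff && sil <= 0xff && uil <= 0xff);
    return ((uint32_t)mil << 24) | ((uint32_t)sil << 8) | (uint32_t)uil;
}

/* sintstatus: mintstatus with the machine-level field filtered out. */
static uint32_t sintstatus_view(uint32_t mintstatus)
{
    return mintstatus & (MINTSTATUS_SIL | MINTSTATUS_UIL);
}

/* uintstatus (assumed by analogy): only the user-level field is visible. */
static uint32_t uintstatus_view(uint32_t mintstatus)
{
    return mintstatus & MINTSTATUS_UIL;
}

/* Extract the active machine-mode interrupt level. */
static unsigned mintstatus_mil(uint32_t mintstatus)
{
    return (mintstatus & MINTSTATUS_MIL) >> 24;
}
```

For example, packing levels 0xa5, 0x3c and 0x7e gives 0xa5003c7e, of which
the supervisor view keeps 0x00003c7e and the user view keeps 0x0000007e.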

Signed-off-by: Ian Brockbank 
Signed-off-by: LIU Zhiwei 
---
 target/riscv/cpu.h  |  3 +++
 target/riscv/cpu_bits.h | 11 +++
 target/riscv/csr.c  | 31 +++
 3 files changed, 45 insertions(+)

diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index 1619c3acb6..95303f50d3 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -259,6 +259,7 @@ struct CPUArchState {
 bool software_seip;

 uint64_t miclaim;
+uint64_t mintstatus; /* clic-spec */

 uint64_t mie;
 uint64_t mideleg;
@@ -461,6 +462,8 @@ struct CPUArchState {
 QEMUTimer *vstimer; /* Internal timer for VS-mode interrupt */
 bool vstime_irq;

+void *clic;   /* clic interrupt controller */
+
 hwaddr kernel_addr;
 hwaddr fdt_addr;

diff --git a/target/riscv/cpu_bits.h b/target/riscv/cpu_bits.h
index 32b068f18a..2e65495b54 100644
--- a/target/riscv/cpu_bits.h
+++ b/target/riscv/cpu_bits.h
@@ -165,6 +165,7 @@
 #define CSR_MCAUSE  0x342
 #define CSR_MTVAL   0x343
 #define CSR_MIP 0x344
+#define CSR_MINTSTATUS  0xfb1 /* clic-spec-draft */

 /* Machine-Level Window to Indirectly Accessed Registers (AIA) */
 #define CSR_MISELECT0x350
@@ -206,6 +207,7 @@
 #define CSR_SCAUSE  0x142
 #define CSR_STVAL   0x143
 #define CSR_SIP 0x144
+#define CSR_SINTSTATUS  0xdb1 /* clic-spec-draft */

 /* Sstc supervisor CSRs */
 #define CSR_STIMECMP0x14D
@@ -733,6 +735,15 @@ typedef enum RISCVException {
 #define SIP_SEIP   MIP_SEIP
 #define SIP_LCOFIP MIP_LCOFIP

+/* mintstatus */
+#define MINTSTATUS_MIL 0xff000000 /* mil[31:24] */
+#define MINTSTATUS_SIL 0x0000ff00 /* sil[15:8] */
+#define MINTSTATUS_UIL 0x000000ff /* uil[7:0] */
+
+/* sintstatus */
+#define SINTSTATUS_SIL 0x0000ff00 /* sil[15:8] */
+#define SINTSTATUS_UIL 0x000000ff /* uil[7:0] */
+
 /* MIE masks */
 #define MIE_SEIE   (1 << IRQ_S_EXT)
 #define MIE_UEIE   (1 << IRQ_U_EXT)
diff --git a/target/riscv/csr.c b/target/riscv/csr.c
index ea3560342c..f9ed7b9079 100644
--- a/target/riscv/csr.c
+++ b/target/riscv/csr.c
@@ -578,6 +578,16 @@ static RISCVException debug(CPURISCVState *env, int csrno)

 return RISCV_EXCP_ILLEGAL_INST;
 }
+
+static int clic(CPURISCVState *env, int csrno)
+{
+if (env->clic) {
+return RISCV_EXCP_NONE;
+}
+
+return RISCV_EXCP_ILLEGAL_INST;
+}
+
 #endif

 static RISCVException seed(CPURISCVState *env, int csrno)
@@ -2887,6 +2897,12 @@ static RISCVException rmw_mviph(CPURISCVState *env, int csrno,
 return ret;
 }

+static int read_mintstatus(CPURISCVState *env, int csrno, target_ulong *val)
+{
+*val = env->mintstatus;
+return RISCV_EXCP_NONE;
+}
+
 /* Supervisor Trap Setup */
 static RISCVException read_sstatus_i128(CPURISCVState *env, int csrno,
 Int128 *val)
@@ -3298,6 +3314,14 @@ static RISCVException rmw_siph(CPURISCVState *env, int csrno,
 return ret;
 }

+static int read_sintstatus(CPURISCVState *env, int csrno, target_ulong *val)
+{
+/* sintstatus is a filtered view of mintstatus with the PRV_M removed */
+target_ulong mask = SINTSTATUS_SIL | SINTSTATUS_UIL;
+*val = env->mintstatus & mask;
+return RISCV_EXCP_NONE;
+}
+
 /* Supervisor Protection and Translation */
 static RISCVException read_satp(CPURISCVState *env, int csrno,
 target_ulong *val)
@@ -5594,6 +5618,13 @@ riscv_csr_operations csr_ops[CSR_TABLE_SIZE] = {
  write_mhpmcounterh },
 [CSR_MHPMCOUNTER31H] = { "mhpmcounter31h", mctr32,  read_hpmcounterh,
  write_mhpmcounterh },
+
+/* Machine Mode Core Level Interrupt Controller */
+[CSR_MINTSTATUS] = { "mintstatus", clic,  read_mintstatus   },
+
+/* Supervisor Mode Core Level Interrupt Controller */
+[CSR_SINTSTATUS] = { "sintstatus", clic,  read_sintstatus   },
+
 [CSR_SCOUNTOVF]  = { "scountovf", sscofpmf,  read_scountovf,
  .min_priv_ver = PRIV_VERSION_1_12_0 },

--
2.46.0.windows.1

Re: [PATCH v4 4/6] machine/nitro-enclave: Add built-in Nitro Secure Module device

2024-08-19 Thread Daniel P . Berrangé
On Mon, Aug 19, 2024 at 10:07:02PM +0600, Dorjoy Chowdhury wrote:
> On Mon, Aug 19, 2024 at 9:53 PM Daniel P. Berrangé  
> wrote:
> >
> > On Mon, Aug 19, 2024 at 09:32:55PM +0600, Dorjoy Chowdhury wrote:
> > > On Mon, Aug 19, 2024 at 4:13 PM Alexander Graf  wrote:
> > > >
> > > > Hey Dorjoy,
> > > >
> > > > On 18.08.24 13:42, Dorjoy Chowdhury wrote:
> > > > > AWS Nitro Enclaves have built-in Nitro Secure Module (NSM) device which
> > > > > is used for stripped down TPM functionality like attestation. This commit
> > > > > adds the built-in NSM device in the nitro-enclave machine type.
> > > > >
> > > > > In Nitro Enclaves, all the PCRs start in a known zero state and the first
> > > > > 16 PCRs are locked from boot and reserved. The PCR0, PCR1, PCR2 and PCR8
> > > > > contain the SHA384 hashes related to the EIF file used to boot the
> > > > > VM for validation.
> > > > >
> > > > > Some optional nitro-enclave machine options have been added:
> > > > >  - 'id': Enclave identifier, reflected in the module-id of the NSM
> > > > > device. If not provided, a default id will be set.
> > > > >  - 'parent-role': Parent instance IAM role ARN, reflected in PCR3
> > > > > of the NSM device.
> > > > >  - 'parent-id': Parent instance identifier, reflected in PCR4 of the
> > > > > NSM device.
> > > > >
> > > > > Signed-off-by: Dorjoy Chowdhury 
> > > > > ---
> > > > >   crypto/meson.build  |   2 +-
> > > > >   crypto/x509-utils.c |  73 +++
> > > >
> > > >
> > > > Can you please put this new API into its own patch file?
> > > >
> > > >
> > > > >   hw/core/eif.c   | 225 +---
> > > > >   hw/core/eif.h   |   5 +-
> > > >
> > > >
> > > > These changes to eif.c should ideally already be part of the patch that
> > > > introduces eif.c (patch 1), no? In fact, do you think you can make the
> > > > whole eif logic its own patch file?
> > > >
> > > >
> > > > >   hw/core/meson.build |   4 +-
> > > > >   hw/i386/Kconfig |   1 +
> > > > >   hw/i386/nitro_enclave.c | 141 +++-
> > > > >   include/crypto/x509-utils.h |  22 
> > > > >   include/hw/i386/nitro_enclave.h |  26 
> > > > >   9 files changed, 479 insertions(+), 20 deletions(-)
> > > > >   create mode 100644 crypto/x509-utils.c
> > > > >   create mode 100644 include/crypto/x509-utils.h
> > > > >
> > > > > diff --git a/crypto/meson.build b/crypto/meson.build
> > > > > index c46f9c22a7..09633194ed 100644
> > > > > --- a/crypto/meson.build
> > > > > +++ b/crypto/meson.build
> > > > > @@ -62,7 +62,7 @@ endif
> > > > >   if gcrypt.found()
> > > > > util_ss.add(gcrypt, files('random-gcrypt.c'))
> > > > >   elif gnutls.found()
> > > > > -  util_ss.add(gnutls, files('random-gnutls.c'))
> > > > > +  util_ss.add(gnutls, files('random-gnutls.c', 'x509-utils.c'))
> > > >
> > > >
> > > > What if we don't have gnutls. Will everything still compile or do we
> > > > need to add any dependencies?
> > > >
> > > >
> > >
> > > [...]
> > >
> > > > >
> > > > > diff --git a/hw/core/meson.build b/hw/core/meson.build
> > > > > index f32d1ad943..8dc4552e35 100644
> > > > > --- a/hw/core/meson.build
> > > > > +++ b/hw/core/meson.build
> > > > > @@ -12,6 +12,8 @@ hwcore_ss.add(files(
> > > > > 'qdev-clock.c',
> > > > >   ))
> > > > >
> > > > > +libcbor = dependency('libcbor', version: '>=0.7.0')
> > > > > +
> > > > >   common_ss.add(files('cpu-common.c'))
> > > > >   common_ss.add(files('machine-smp.c'))
> > > > >   system_ss.add(when: 'CONFIG_FITLOADER', if_true: files('loader-fit.c'))
> > > > > @@ -24,7 +26,7 @@ system_ss.add(when: 'CONFIG_REGISTER', if_true: files('register.c'))
> > > > >   system_ss.add(when: 'CONFIG_SPLIT_IRQ', if_true: files('split-irq.c'))
> > > > >   system_ss.add(when: 'CONFIG_XILINX_AXI', if_true: files('stream.c'))
> > > > >   system_ss.add(when: 'CONFIG_PLATFORM_BUS', if_true: files('sysbus-fdt.c'))
> > > > > -system_ss.add(when: 'CONFIG_NITRO_ENCLAVE', if_true: [files('eif.c'), zlib])
> > > > > +system_ss.add(when: 'CONFIG_NITRO_ENCLAVE', if_true: [files('eif.c'), zlib, libcbor, gnutls])
> > > >
> > > >
> > > > Ah, you add the gnutls dependency here. Great! However, this means we
> > > > now make gnutls (and libcbor) a mandatory dependency for the default
> > > > configuration. Does configure know about that? I believe before gnutls
> > > > was optional, right?
> > > >
> > >
> > > I see gnutls is not a required dependency in the root meson.build. I
> > > am not sure what we should do here.
> > >
> > > Hey Daniel, do you have any suggestions about how this dependency
> > > should be included?
> >
> > Unconditionally build the crypto/x509-utils.c file, but in that file
> > put #ifdef CONFIG_GNUTLS, and in the #else put a stub impl of the
> > method that just calls error_setg(
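
The stub pattern suggested above can be sketched as follows. This is only an illustration, not code from the series: the function name x509_fingerprint_demo, the DemoError type and demo_error_set() are hypothetical stand-ins for the real helper, QEMU's Error object and error_setg(), and the error string is made up.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for QEMU's Error object. */
typedef struct DemoError {
    char msg[128];
} DemoError;

/* Hypothetical stand-in for error_setg(): record a message if errp is set. */
static void demo_error_set(DemoError *errp, const char *msg)
{
    if (errp) {
        snprintf(errp->msg, sizeof(errp->msg), "%s", msg);
    }
}

#ifdef CONFIG_GNUTLS
/* The GnuTLS-backed implementation would live here (omitted in this sketch). */
int x509_fingerprint_demo(const unsigned char *cert, size_t size,
                          DemoError *errp);
#else
/* Stub: always built, fails cleanly when GnuTLS support is not compiled in. */
int x509_fingerprint_demo(const unsigned char *cert, size_t size,
                          DemoError *errp)
{
    (void)cert;
    (void)size;
    demo_error_set(errp, "GNUTLS is required to get fingerprint");
    return -1;
}
#endif
```

With this shape the file can be listed unconditionally in the build, and callers only see a runtime error when the library is missing.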

[PATCH 02/11 v2] target/riscv: Update CSR xintthresh in CLIC mode

2024-08-19 Thread Ian Brockbank
From: Ian Brockbank 

The interrupt-level threshold (xintthresh) CSR holds an 8-bit field
for the threshold level of the associated privilege mode.

For horizontal interrupts, only the ones with higher interrupt levels
than the threshold level are allowed to preempt.
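
As a rough illustration of the rule above (not code from this patch), the check can be written as a small helper. The name clic_level_above_threshold is hypothetical, and placing the 8-bit threshold field in the low byte of the CSR is an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: may a pending horizontal interrupt preempt, given
 * the current xintthresh value?  Assumes the 8-bit threshold field sits in
 * the low byte of the CSR.
 */
static bool clic_level_above_threshold(uint8_t irq_level, uint64_t xintthresh)
{
    uint8_t th = xintthresh & 0xff;  /* keep only the 8-bit threshold field */

    return irq_level > th;           /* only strictly higher levels preempt */
}
```

An interrupt whose level equals the threshold is held back, since the comparison is strict.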

Signed-off-by: Ian Brockbank 
Signed-off-by: LIU Zhiwei 
---
 target/riscv/cpu.h  |  2 ++
 target/riscv/cpu_bits.h |  2 ++
 target/riscv/csr.c  | 28 
 3 files changed, 32 insertions(+)

diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index 95303f50d3..9b5f36ad0a 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -260,6 +260,7 @@ struct CPUArchState {

 uint64_t miclaim;
 uint64_t mintstatus; /* clic-spec */
+target_ulong mintthresh; /* clic-spec */

 uint64_t mie;
 uint64_t mideleg;
@@ -283,6 +284,7 @@ struct CPUArchState {
 target_ulong stvec;
 target_ulong sepc;
 target_ulong scause;
+target_ulong sintthresh; /* clic-spec */

 target_ulong mtvec;
 target_ulong mepc;
diff --git a/target/riscv/cpu_bits.h b/target/riscv/cpu_bits.h
index 2e65495b54..ad45402370 100644
--- a/target/riscv/cpu_bits.h
+++ b/target/riscv/cpu_bits.h
@@ -166,6 +166,7 @@
 #define CSR_MTVAL   0x343
 #define CSR_MIP 0x344
 #define CSR_MINTSTATUS  0xfb1 /* clic-spec-draft */
+#define CSR_MINTTHRESH  0x347 /* clic-spec-draft */

 /* Machine-Level Window to Indirectly Accessed Registers (AIA) */
 #define CSR_MISELECT    0x350
@@ -208,6 +209,7 @@
 #define CSR_STVAL   0x143
 #define CSR_SIP 0x144
 #define CSR_SINTSTATUS  0xdb1 /* clic-spec-draft */
+#define CSR_SINTTHRESH  0x147 /* clic-spec-draft */

 /* Sstc supervisor CSRs */
 #define CSR_STIMECMP    0x14D
diff --git a/target/riscv/csr.c b/target/riscv/csr.c
index f9ed7b9079..9c824c0d8f 100644
--- a/target/riscv/csr.c
+++ b/target/riscv/csr.c
@@ -2903,6 +2903,18 @@ static int read_mintstatus(CPURISCVState *env, int csrno, target_ulong *val)
 return RISCV_EXCP_NONE;
 }

+static int read_mintthresh(CPURISCVState *env, int csrno, target_ulong *val)
+{
+*val = env->mintthresh;
+return RISCV_EXCP_NONE;
+}
+
+static int write_mintthresh(CPURISCVState *env, int csrno, target_ulong val)
+{
+env->mintthresh = val;
+return RISCV_EXCP_NONE;
+}
+
 /* Supervisor Trap Setup */
 static RISCVException read_sstatus_i128(CPURISCVState *env, int csrno,
 Int128 *val)
@@ -3322,6 +3334,18 @@ static int read_sintstatus(CPURISCVState *env, int csrno, target_ulong *val)
 return RISCV_EXCP_NONE;
 }

+static int read_sintthresh(CPURISCVState *env, int csrno, target_ulong *val)
+{
+*val = env->sintthresh;
+return RISCV_EXCP_NONE;
+}
+
+static int write_sintthresh(CPURISCVState *env, int csrno, target_ulong val)
+{
+env->sintthresh = val;
+return RISCV_EXCP_NONE;
+}
+
 /* Supervisor Protection and Translation */
 static RISCVException read_satp(CPURISCVState *env, int csrno,
 target_ulong *val)
@@ -5621,9 +5645,13 @@ riscv_csr_operations csr_ops[CSR_TABLE_SIZE] = {

 /* Machine Mode Core Level Interrupt Controller */
 [CSR_MINTSTATUS] = { "mintstatus", clic,  read_mintstatus   },
+[CSR_MINTTHRESH] = { "mintthresh", clic,  read_mintthresh,
+ write_mintthresh },

 /* Supervisor Mode Core Level Interrupt Controller */
 [CSR_SINTSTATUS] = { "sintstatus", clic,  read_sintstatus   },
+[CSR_SINTTHRESH] = { "sintthresh", clic,  read_sintthresh,
+ write_sintthresh },

 [CSR_SCOUNTOVF]  = { "scountovf", sscofpmf,  read_scountovf,
  .min_priv_ver = PRIV_VERSION_1_12_0 },
--
2.46.0.windows.1



[PATCH 10/11 v2] hw/riscv: add CLIC into virt machine

2024-08-19 Thread Ian Brockbank
Signed-off-by: Ian Brockbank
---
 hw/riscv/virt.c | 235 +++-
 include/hw/riscv/virt.h |  35 ++
 2 files changed, 267 insertions(+), 3 deletions(-)

diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c
index cef41c150a..68d614ad5c 100644
--- a/hw/riscv/virt.c
+++ b/hw/riscv/virt.c
@@ -1,4 +1,4 @@
-/*
+ /*
  * QEMU RISC-V VirtIO Board
  *
  * Copyright (c) 2017 SiFive, Inc.
@@ -39,6 +39,7 @@
 #include "hw/firmware/smbios.h"
 #include "hw/intc/riscv_aclint.h"
 #include "hw/intc/riscv_aplic.h"
+#include "hw/intc/riscv_clic.h"
 #include "hw/intc/sifive_plic.h"
 #include "hw/misc/sifive_test.h"
 #include "hw/platform-bus.h"
@@ -72,6 +73,7 @@ static const MemMapEntry virt_memmap[] = {
 [VIRT_MROM] = { 0x1000,0xf000 },
 [VIRT_TEST] = {   0x10,0x1000 },
 [VIRT_RTC] =  {   0x101000,0x1000 },
+[VIRT_CLIC] = {  0x200, VIRT_CLIC_MAX_SIZE(VIRT_CPUS_MAX)},
 [VIRT_CLINT] ={  0x200,   0x1 },
 [VIRT_ACLINT_SSWI] =  {  0x2F0,0x4000 },
 [VIRT_PCIE_PIO] = {  0x300,   0x1 },
@@ -424,6 +426,37 @@ static void create_fdt_socket_aclint(RISCVVirtState *s,
 }
 }

+static void create_fdt_socket_clic(RISCVVirtState *s,
+   const MemMapEntry *memmap, int socket)
+{
+g_autofree char *clic_name = NULL;
+g_autofree uint32_t *clic_cells = NULL;
+unsigned long mclicbase;
+MachineState *ms = MACHINE(s);
+static const char * const clic_compat[1] = {
+"riscv,clic-0.9"
+};
+
+/*
+ * The spec doesn't define a memory layout, other than to say that each
+ * CLIC should be on a 4KiB boundary if memory-mapped.
+ * This implementation makes all the CLICs contiguous, in the order M, S, U,
+ * and assumes the worst-case size.
+ * TODO: create entries for each CLIC on the system.
+ */
+mclicbase = memmap[VIRT_CLIC].base;
+clic_name = g_strdup_printf("/soc/clic@%lx", mclicbase);
+qemu_fdt_add_subnode(ms->fdt, clic_name);
+qemu_fdt_setprop_string_array(ms->fdt, clic_name, "compatible",
+  (char **)&clic_compat,
+  ARRAY_SIZE(clic_compat));
+qemu_fdt_setprop_cells(ms->fdt, clic_name, "regs",
+0x0, mclicbase, 0x0, memmap[VIRT_CLIC].size);
+qemu_fdt_setprop_cell(ms->fdt, clic_name, "riscv,num-sources",
+  VIRT_IRQCHIP_NUM_SOURCES);
+riscv_socket_fdt_write_id(ms, clic_name, socket);
+}
+
 static void create_fdt_socket_plic(RISCVVirtState *s,
const MemMapEntry *memmap, int socket,
uint32_t *phandle, uint32_t *intc_phandles,
@@ -759,7 +792,10 @@ static void create_fdt_sockets(RISCVVirtState *s, const MemMapEntry *memmap,

 create_fdt_socket_memory(s, memmap, socket);

-if (virt_aclint_allowed() && s->have_aclint) {
+
+if (s->have_clic) {
+create_fdt_socket_clic(s, memmap, socket);
+} else if (virt_aclint_allowed() && s->have_aclint) {
 create_fdt_socket_aclint(s, memmap, socket,
  &intc_phandles[phandle_pos]);
 } else if (tcg_enabled()) {
@@ -1206,6 +1242,37 @@ static DeviceState *virt_create_plic(const MemMapEntry *memmap, int socket,
 return ret;
 }

+static DeviceState *virt_create_clic(RISCVVirtState *s, uint64_t clic_base,
+ int hartid)
+{
+DeviceState *ret;
+uint32_t block_size = VIRT_CLIC_HART_SIZE(s->clic_prv_s, s->clic_prv_u);
+uint64_t mclicbase = clic_base + hartid * block_size;
+uint64_t sclicbase = 0;
+uint64_t uclicbase = 0;
+
+/*
+ * The spec doesn't define a memory layout, other than to say that each
+ * CLIC should be on a 4KiB boundary if memory-mapped.
+ * This implementation makes all the CLICs contiguous, in the order M, S, U.
+ */
+if (s->clic_prv_s) {
+sclicbase = mclicbase + VIRT_CLIC_BLOCK_SIZE;
+}
+if (s->clic_prv_u) {
+uclicbase = mclicbase + VIRT_CLIC_BLOCK_SIZE;
+if (s->clic_prv_s) {
+uclicbase += VIRT_CLIC_BLOCK_SIZE;
+}
+}
+ret = riscv_clic_create(mclicbase, sclicbase, uclicbase,
+hartid, VIRT_IRQCHIP_NUM_SOURCES,
+s->clic_intctlbits,
+s->clic_version);
+
+return ret;
+}
+
 static DeviceState *virt_create_aia(RISCVVirtAIAType aia_type, int aia_guests,
 const MemMapEntry *memmap, int socket,
 int base_hartid, int hart_count)
@@ -1505,7 +1572,7 @@ static void virt_machine_init(MachineState *machine)
 i * memmap[VIRT_ACLINT_SSWI].size,
 base_hartid, hart_count, true);
 }
-} else if (tcg_enabled(

[PATCH 11/11 v2] tests: add riscv clic qtest case and a function in qtest

2024-08-19 Thread Ian Brockbank
This adds riscv32-clic-test.c, containing qtest test cases for configuring
CLIC (via virt machine) and for triggering interrupts.

In order to detect the interrupts, qtest.c has been updated to send interrupt
information back to the test about the IRQ being delivered. Since we need to
both trigger and detect the interrupt, qtest has also been updated to allow
both an input and an output GPIO to be intercepted.
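
The encoding described above can be tried out in isolation. The sketch below mirrors the helpers this patch adds to system/qtest.c (a 12-bit IRQ number in the low bits, the pin level above it), but the demo_ and DEMO_ names are renamed stand-ins so the snippet builds on its own.

```c
/* Standalone copy of the packing scheme; names are illustrative only. */
#define DEMO_IRQN_MASK    0x0fff  /* 12-bit interrupt number */
#define DEMO_LEVEL_SHIFT  12      /* pin level is stored above the number */

/* Pack an IRQ number and a pin level into a single int. */
static int demo_encode_irq(int irqn, int level)
{
    return (irqn & DEMO_IRQN_MASK) | (level << DEMO_LEVEL_SHIFT);
}

/* Unpack a value produced by demo_encode_irq(). */
static void demo_decode_irq(int value, int *irqn, int *level)
{
    *irqn = value & DEMO_IRQN_MASK;
    *level = value >> DEMO_LEVEL_SHIFT;
}
```

Any value encoded with level 1 is at least 4096, which is why the handler can tell an encoded value from a plain 0/1 pin level by testing for a value greater than 1.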

Signed-off-by: Troy Song 
Signed-off-by: Ian Brockbank 
---
 hw/intc/riscv_clic.c|4 +
 include/sysemu/qtest.h  |2 +
 system/qtest.c  |   72 +-
 tests/qtest/libqtest.c  |9 +
 tests/qtest/libqtest.h  |9 +
 tests/qtest/meson.build |3 +-
 tests/qtest/riscv32-clic-test.c | 1928 +++
 7 files changed, 2010 insertions(+), 17 deletions(-)
 create mode 100644 tests/qtest/riscv32-clic-test.c

diff --git a/hw/intc/riscv_clic.c b/hw/intc/riscv_clic.c
index 1800e84dfd..155ba65492 100644
--- a/hw/intc/riscv_clic.c
+++ b/hw/intc/riscv_clic.c
@@ -174,6 +174,10 @@ static void riscv_clic_next_interrupt(void *opaque)
 clic->clicintip[active->irq] = 0;
 }
 /* Post pending interrupt for this hart */
+if (qtest_enabled()) {
+qemu_set_irq(clic->cpu_irq, qtest_encode_irq(active->irq, 1));
+return;
+}
 clic->exccode = active->irq |
 mode << RISCV_EXCP_CLIC_MODE_SHIFT |
 level << RISCV_EXCP_CLIC_LEVEL_SHIFT;
diff --git a/include/sysemu/qtest.h b/include/sysemu/qtest.h
index c161d75165..1a34d27c6d 100644
--- a/include/sysemu/qtest.h
+++ b/include/sysemu/qtest.h
@@ -34,6 +34,8 @@ void qtest_server_init(const char *qtest_chrdev, const char *qtest_log, Error **
 void qtest_server_set_send_handler(void (*send)(void *, const char *),
  void *opaque);
 void qtest_server_inproc_recv(void *opaque, const char *buf);
+
+int qtest_encode_irq(int irqn, int level);
 #endif

 #endif
diff --git a/system/qtest.c b/system/qtest.c
index 12703a2045..0ba6e0fcbe 100644
--- a/system/qtest.c
+++ b/system/qtest.c
@@ -49,7 +49,8 @@ struct QTest {

 bool qtest_allowed;

-static DeviceState *irq_intercept_dev;
+static DeviceState *irq_intercept_dev_in;
+static DeviceState *irq_intercept_dev_out;
 static FILE *qtest_log_fp;
 static QTest *qtest;
 static GString *inbuf;
@@ -61,6 +62,14 @@ static void *qtest_server_send_opaque;

 #define FMT_timeval "%.06f"

+/*
+ * Encoding for passing the specific IRQ information from an interrupt handler
+ * to QTest. This needs to support CLIC, which has a 12-bit interrupt number.
+ */
+#define QTEST_IRQN  0x0fff
+#define QTEST_IRQN_SHIFT0
+#define QTEST_IRQ_LEVEL_SHIFT   12
+
 /**
  * DOC: QTest Protocol
  *
@@ -311,6 +320,18 @@ void qtest_sendf(CharBackend *chr, const char *fmt, ...)
 va_end(ap);
 }

+/* Encode the IRQ number and level for QTest */
+int qtest_encode_irq(int irqn, int level)
+{
+return (irqn & QTEST_IRQN) | (level << QTEST_IRQ_LEVEL_SHIFT);
+}
+
+static void qtest_decode_irq(int value, int *irqn, int *level)
+{
+*irqn = value & QTEST_IRQN;
+*level = value >> QTEST_IRQ_LEVEL_SHIFT;
+}
+
 static void qtest_irq_handler(void *opaque, int n, int level)
 {
 qemu_irq old_irq = *(qemu_irq *)opaque;
@@ -320,6 +341,16 @@ static void qtest_irq_handler(void *opaque, int n, int level)
 CharBackend *chr = &qtest->qtest_chr;
 irq_levels[n] = level;
 qtest_send_prefix(chr);
+if (level > 1) {
+int delivered_irq_num, pin_level;
+qtest_decode_irq(level, &delivered_irq_num, &pin_level);
+qtest_sendf(chr, "IRQ %s %d\n",
+"delivered", delivered_irq_num);
+qtest_send_prefix(chr);
+qtest_sendf(chr, "IRQ %s %d\n",
+pin_level ? "raise" : "lower", n);
+return;
+}
 qtest_sendf(chr, "IRQ %s %d\n",
 level ? "raise" : "lower", n);
 }
@@ -369,6 +400,7 @@ static void qtest_process_command(CharBackend *chr, gchar **words)
 bool is_named;
 bool is_outbound;
 bool interception_succeeded = false;
+bool interception_duplicated = false;

 g_assert(words[1]);
 is_named = words[2] != NULL;
@@ -386,38 +418,46 @@ static void qtest_process_command(CharBackend *chr, gchar **words)
 return;
 }

-if (irq_intercept_dev) {
-qtest_send_prefix(chr);
-if (irq_intercept_dev != dev) {
-qtest_send(chr, "FAIL IRQ intercept already enabled\n");
-} else {
-qtest_send(chr, "OK\n");
-}
-return;
-}
-
 QLIST_FOREACH(ngl, &dev->gpios, node) {
 /* We don't support inbound interception of named GPIOs yet */
 if (is_outbound) {
+   

Re: [PATCH v5 01/16] meson: Add optional dependency on IGVM library

2024-08-19 Thread Daniel P . Berrangé
On Tue, Aug 13, 2024 at 04:01:03PM +0100, Roy Hopkins wrote:
> The IGVM library allows Independent Guest Virtual Machine files to be
> parsed and processed. IGVM files are used to configure guest memory
> layout, initial processor state and other configuration pertaining to
> secure virtual machines.
> 
> This adds the --enable-igvm configure option, enabled by default, which
> attempts to locate and link against the IGVM library via pkgconfig and
> sets CONFIG_IGVM if found.
> 
> The library is added to the system_ss target in backends/meson.build
> where the IGVM parsing will be performed by the ConfidentialGuestSupport
> object.
> 
> Signed-off-by: Roy Hopkins 
> Acked-by: Michael S. Tsirkin 
> ---
>  backends/meson.build  | 3 +++
>  meson.build   | 8 
>  meson_options.txt | 2 ++
>  scripts/meson-buildoptions.sh | 3 +++
>  4 files changed, 16 insertions(+)

Reviewed-by: Daniel P. Berrangé 


With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH v4 4/6] machine/nitro-enclave: Add built-in Nitro Secure Module device

2024-08-19 Thread Dorjoy Chowdhury
On Mon, Aug 19, 2024 at 9:58 PM Alexander Graf  wrote:
>
>
> On 19.08.24 17:28, Dorjoy Chowdhury wrote:
> > Hey Alex,
> >
> > On Mon, Aug 19, 2024 at 4:13 PM Alexander Graf  wrote:
> >> Hey Dorjoy,
> >>
> >> On 18.08.24 13:42, Dorjoy Chowdhury wrote:
> >>> AWS Nitro Enclaves have built-in Nitro Secure Module (NSM) device which
> >>> is used for stripped down TPM functionality like attestation. This commit
> >>> adds the built-in NSM device in the nitro-enclave machine type.
> >>>
> >>> In Nitro Enclaves, all the PCRs start in a known zero state and the first
> >>> 16 PCRs are locked from boot and reserved. The PCR0, PCR1, PCR2 and PCR8
> >>> contain the SHA384 hashes related to the EIF file used to boot the
> >>> VM for validation.
> >>>
> >>> Some optional nitro-enclave machine options have been added:
> >>>   - 'id': Enclave identifier, reflected in the module-id of the NSM
> >>> device. If not provided, a default id will be set.
> >>>   - 'parent-role': Parent instance IAM role ARN, reflected in PCR3
> >>> of the NSM device.
> >>>   - 'parent-id': Parent instance identifier, reflected in PCR4 of the
> >>> NSM device.
> >>>
> >>> Signed-off-by: Dorjoy Chowdhury 
> >>> ---
> >>>crypto/meson.build  |   2 +-
> >>>crypto/x509-utils.c |  73 +++
> >>
> >> Can you please put this new API into its own patch file?
> >>
> >>
> >>>hw/core/eif.c   | 225 +---
> >>>hw/core/eif.h   |   5 +-
> >>
> >> These changes to eif.c should ideally already be part of the patch that
> >> introduces eif.c (patch 1), no? In fact, do you think you can make the
> >> whole eif logic its own patch file?
> >>
> > Good point. I guess it should be possible if I have the virtio-nsm
> > device commit first and then add the machine/nitro-enclave commit with
> > full support with the devices. That will of course make the
> > machine/nitro-enclave commit larger. What do you think?
>
>
> As long as nothing compiles the code, it can rely on not yet implemented
> functions. So it's perfectly legit to add all your code in individual
> commits and then at the end add the meson.build change that implements
> the config option. How about the order below?
>
> * Crypto patch for SHA384
> * Crypto patch for x509 fingerprint
> * NSM device emulation (including libcbor check, introduces
> CONFIG_VIRTIO_NSM)
> * EIF format parsing (not compiled yet)
> * Nitro Enclaves machine (introduces CONFIG_NITRO_ENCLAVE)
> * Nitro Enclaves docs
>

Sounds good to me. Thanks Alex!

Regards,
Dorjoy



[PATCH 03/11 v2] hw/intc: Add CLIC device

2024-08-19 Thread Ian Brockbank
From: Ian Brockbank 

The Core-Local Interrupt Controller (CLIC) provides low-latency,
vectored, pre-emptive interrupts for RISC-V systems.

The CLIC also supports a new Selective Hardware Vectoring feature
that allows users to optimize each interrupt for either faster
response or smaller code size.

Signed-off-by: LIU Zhiwei 
Signed-off-by: Ian Brockbank 
---
 hw/intc/Kconfig  |3 +
 hw/intc/meson.build  |3 +-
 hw/intc/riscv_clic.c | 1037 ++
 hw/riscv/Kconfig |1 +
 include/hw/intc/riscv_clic.h |  213 +++
 target/riscv/cpu.h   |2 +
 target/riscv/cpu_bits.h  |   17 +
 7 files changed, 1275 insertions(+), 1 deletion(-)
 create mode 100644 hw/intc/riscv_clic.c
 create mode 100644 include/hw/intc/riscv_clic.h

diff --git a/hw/intc/Kconfig b/hw/intc/Kconfig
index dd405bdb5d..1cd4c2f58c 100644
--- a/hw/intc/Kconfig
+++ b/hw/intc/Kconfig
@@ -81,6 +81,9 @@ config SIFIVE_PLIC
 bool
 select MSI_NONBROKEN

+config RISCV_CLIC
+bool
+
 config GOLDFISH_PIC
 bool

diff --git a/hw/intc/meson.build b/hw/intc/meson.build
index f4d81eb8e4..a9207dfb9e 100644
--- a/hw/intc/meson.build
+++ b/hw/intc/meson.build
@@ -59,9 +59,10 @@ specific_ss.add(when: 'CONFIG_S390_FLIC_KVM', if_true: files('s390_flic_kvm.c'))
 specific_ss.add(when: 'CONFIG_SH_INTC', if_true: files('sh_intc.c'))
 specific_ss.add(when: 'CONFIG_RISCV_ACLINT', if_true: files('riscv_aclint.c'))
 specific_ss.add(when: 'CONFIG_RISCV_APLIC', if_true: files('riscv_aplic.c'))
+specific_ss.add(when: 'CONFIG_RISCV_CLIC', if_true: files('riscv_clic.c'))
 specific_ss.add(when: 'CONFIG_RISCV_IMSIC', if_true: files('riscv_imsic.c'))
 specific_ss.add(when: 'CONFIG_SIFIVE_PLIC', if_true: files('sifive_plic.c'))
-specific_ss.add(when: 'CONFIG_XICS', if_true: files('xics.c', 'xive2.c'))
+specific_ss.add(when: 'CONFIG_XICS', if_true: files('xics.c'))
 specific_ss.add(when: ['CONFIG_KVM', 'CONFIG_XICS'],
if_true: files('xics_kvm.c'))
 specific_ss.add(when: 'CONFIG_PSERIES', if_true: files('xics_spapr.c', 'spapr_xive.c'))
diff --git a/hw/intc/riscv_clic.c b/hw/intc/riscv_clic.c
new file mode 100644
index 00..1800e84dfd
--- /dev/null
+++ b/hw/intc/riscv_clic.c
@@ -0,0 +1,1037 @@
+/*
+ * RISC-V CLIC(Core Local Interrupt Controller) for QEMU.
+ *
+ * Copyright (c) 2016-2017 Sagar Karandikar, sag...@eecs.berkeley.edu
+ * Copyright (c) 2017-2018 SiFive, Inc.
+ * Copyright (c) 2021 T-Head Semiconductor Co., Ltd. All rights reserved.
+ * Copyright (c) 2024 Cirrus Logic, Inc and
+ *  Cirrus Logic International Semiconductor Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2 or later, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program.  If not, see .
+ *
+ * This implementation follows the CLIC 0.9-stable draft at 14 March 2024,
+ * with the following exceptions and implementation details:
+ *  - the CLIC control registers are memory-mapped as per earlier drafts (in
+ *    particular version 0.9-draft, 20 June 2023)
+ *  - the indirect CSR control in 0.9-stable is not implemented
+ *  - the vector table can be either handler addresses (as per the spec)
+ *    or a jump table where each entry is processed as an instruction,
+ *    selectable with version number v0.9-jmp
+ *  - each hart is assigned its own CLIC block
+ *  - support for PRV_S and PRV_U is selectable at CLIC instantiation by
+ *    passing in a base address for the given modes; a base address of 0 is
+ *    treated as not supported
+ *  - PRV_S and PRV_U registers are mapped onto the PRV_M controls with
+ *    appropriate filtering for the access mode
+ *
+ * The implementation has a RISCVCLICState per hart, with a RISCVCLICView
+ * for each mode subsidiary to that. Each view knows its access mode and base
+ * address, as well as the RISCVCLICState with which it is associated.
+ *
+ * MMIO accesses go through the view, allowing the appropriate permissions to
+ * be enforced when accessing the parent RISCVCLICState for the settings.
+ */
+
+#include "qemu/osdep.h"
+#include "qapi/error.h"
+#include "qemu/log.h"
+#include "qemu/main-loop.h"
+#include "hw/sysbus.h"
+#include "sysemu/qtest.h"
+#include "target/riscv/cpu.h"
+#include "hw/qdev-properties.h"
+#include "hw/intc/riscv_clic.h"
+
+static const char *modeview_name[] = {
+TYPE_RISCV_CLIC "_prv_u",   /* PRV_U */
+TYPE_RISCV_CLIC "_prv_s",   /* PRV_S */
+NULL,   /* reserved */
+TYPE_RISCV_CLIC "_prv

[PATCH 07/11 v2] target/riscv: Update CSR xnxti in CLIC mode

2024-08-19 Thread Ian Brockbank
From: Ian Brockbank 

The CSR can be used by software to service the next horizontal interrupt
when it has greater level than the saved interrupt context
(held in xcause`.pil`) and greater level than the interrupt threshold of
the corresponding privilege mode,
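
The qualifying condition can be sketched as a standalone predicate. This is a simplified illustration with hypothetical names (demo_xnxti_ready and its parameters), not the patch's get_xnxti_status(); it assumes the caller has already decoded the pending interrupt's level and privilege mode.

```c
#include <stdbool.h>
#include <stdint.h>

/* Larger of two 8-bit interrupt levels. */
static uint8_t demo_max_u8(uint8_t a, uint8_t b)
{
    return a > b ? a : b;
}

/*
 * Hypothetical predicate: is the pending interrupt eligible to be serviced
 * through xnxti?  It must target the current privilege mode (horizontal),
 * must not be hardware vectored, and its level must exceed both the saved
 * level (xcause.pil) and the threshold (xintthresh).
 */
static bool demo_xnxti_ready(uint8_t irq_level, uint8_t saved_pil,
                             uint8_t xintthresh, bool same_priv,
                             bool hw_vectored)
{
    uint8_t floor = demo_max_u8(saved_pil, xintthresh);

    return same_priv && !hw_vectored && irq_level > floor;
}
```

Taking the maximum of the two levels first means a single strict comparison covers both conditions.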

Signed-off-by: LIU Zhiwei 
Signed-off-by: Ian Brockbank 
---
 target/riscv/cpu_bits.h |  25 +
 target/riscv/csr.c  | 111 
 2 files changed, 136 insertions(+)

diff --git a/target/riscv/cpu_bits.h b/target/riscv/cpu_bits.h
index 279a6f889b..3744b34504 100644
--- a/target/riscv/cpu_bits.h
+++ b/target/riscv/cpu_bits.h
@@ -166,6 +166,7 @@
 #define CSR_MCAUSE  0x342
 #define CSR_MTVAL   0x343
 #define CSR_MIP 0x344
+#define CSR_MNXTI   0x345 /* clic-spec-draft */
 #define CSR_MINTSTATUS  0xfb1 /* clic-spec-draft */
 #define CSR_MINTTHRESH  0x347 /* clic-spec-draft */

@@ -210,6 +211,7 @@
 #define CSR_SCAUSE  0x142
 #define CSR_STVAL   0x143
 #define CSR_SIP 0x144
+#define CSR_SNXTI   0x145 /* clic-spec-draft */
 #define CSR_SINTSTATUS  0xdb1 /* clic-spec-draft */
 #define CSR_SINTTHRESH  0x147 /* clic-spec-draft */

@@ -561,6 +563,8 @@
 #define MSTATUS_GVA 0x40ULL
 #define MSTATUS_MPV 0x80ULL

+#define MSTATUS_WRITE_MASK  0x001f
+
 #define MSTATUS64_UXL   0x0003ULL
 #define MSTATUS64_SXL   0x000CULL

@@ -754,6 +758,27 @@ typedef enum RISCVException {
 #define SINTSTATUS_SIL 0xff00 /* sil[15:8] */
 #define SINTSTATUS_UIL 0x00ff /* uil[7:0] */

+/* mcause */
+#define MCAUSE_INT     (1 << (TARGET_LONG_BITS - 1))
+#define MCAUSE_MINHV   0x40000000 /* minhv */
+#define MCAUSE_MPP     0x30000000 /* mpp[1:0] */
+#define MCAUSE_MPIE    0x08000000 /* mpie */
+#define MCAUSE_MPIL    0x00ff0000 /* mpil[7:0] */
+#define MCAUSE_EXCCODE 0x00000fff /* exccode[11:0] */
+
+/* scause */
+#define SCAUSE_INT     (1 << (TARGET_LONG_BITS - 1))
+#define SCAUSE_SINHV   0x40000000 /* sinhv */
+#define SCAUSE_SPP     0x10000000 /* spp */
+#define SCAUSE_SPIE    0x08000000 /* spie */
+#define SCAUSE_SPIL    0x00ff0000 /* spil[7:0] */
+#define SCAUSE_EXCCODE 0x00000fff /* exccode[11:0] */
+
+/* mcause & scause */
+#define XCAUSE_XPP_SHIFT   28
+#define XCAUSE_XPIE_SHIFT  27
+#define XCAUSE_XPIL_SHIFT  16
+
 /* mtvec & stvec */
 #define XTVEC_MODE 0x03
 #define XTVEC_SUBMODE  0x3c
diff --git a/target/riscv/csr.c b/target/riscv/csr.c
index be0071fd25..813a5b927f 100644
--- a/target/riscv/csr.c
+++ b/target/riscv/csr.c
@@ -19,6 +19,7 @@

 #include "qemu/osdep.h"
 #include "qemu/log.h"
+#include "qemu/main-loop.h"
 #include "qemu/timer.h"
 #include "cpu.h"
 #include "tcg/tcg-cpu.h"
@@ -2936,6 +2937,77 @@ static RISCVException rmw_mviph(CPURISCVState *env, int csrno,
 return ret;
 }

+static bool get_xnxti_status(CPURISCVState *env)
+{
+int clic_irq, clic_priv, clic_il, pil;
+
+if (!env->exccode) { /* No interrupt */
+return false;
+}
+/* The system is not in a CLIC mode */
+if (!riscv_clic_is_clic_mode(env)) {
+return false;
+} else {
+riscv_clic_decode_exccode(env->exccode, &clic_priv, &clic_il,
+  &clic_irq);
+
+if (env->priv == PRV_M) {
+pil = MAX(get_field(env->mcause, MCAUSE_MPIL), env->mintthresh);
+} else if (env->priv == PRV_S) {
+pil = MAX(get_field(env->scause, SCAUSE_SPIL), env->sintthresh);
+} else {
+qemu_log_mask(LOG_GUEST_ERROR,
+  "CSR: rmw xnxti with unsupported mode\n");
+exit(1);
+}
+
+if ((clic_priv != env->priv) || /* No horizontal interrupt */
+(clic_il <= pil) || /* No higher level interrupt */
+(riscv_clic_shv_interrupt(env->clic, clic_irq))) {
+/* CLIC vector mode */
+return false;
+} else {
+return true;
+}
+}
+}
+
+static int rmw_mnxti(CPURISCVState *env, int csrno, target_ulong *ret_value,
+ target_ulong new_value, target_ulong write_mask)
+{
+int clic_priv, clic_il, clic_irq;
+bool ready;
+if (write_mask) {
+env->mstatus |= new_value & (write_mask & MSTATUS_WRITE_MASK);
+}
+
+BQL_LOCK_GUARD();
+
+ready = get_xnxti_status(env);
+if (ready) {
+riscv_clic_decode_exccode(env->exccode, &clic_priv, &clic_il,
+  &clic_irq);
+if (write_mask) {
+bool edge = riscv_clic_edge_triggered(env->clic,  clic_irq

[PATCH 08/11 v2] target/riscv: Update interrupt handling in CLIC mode

2024-08-19 Thread Ian Brockbank
From: Ian Brockbank 

Decode the CLIC interrupt information from exccode, which includes the
interrupt privilege mode, interrupt level, and IRQ number.

Then update the xcause, xstatus, xepc and xintstatus CSRs and jump to the
correct PC according to the CLIC specification.
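
To make the decode step concrete, here is a small sketch of packing and unpacking such a value. The field layout (IRQ number in bits 11:0, mode in bits 13:12, level in bits 21:14) and all demo_ and DEMO_ names are assumptions for illustration; the real shifts are defined elsewhere in the series and are not shown in this message.

```c
#include <stdint.h>

/* Assumed layout, for illustration only. */
#define DEMO_EXC_IRQ_MASK     0xfffu  /* irq[11:0] */
#define DEMO_EXC_MODE_SHIFT   12      /* mode[13:12] */
#define DEMO_EXC_MODE_MASK    0x3u
#define DEMO_EXC_LEVEL_SHIFT  14      /* level[21:14] */
#define DEMO_EXC_LEVEL_MASK   0xffu

/* Pack irq, privilege mode and level into one exccode-style word. */
static uint32_t demo_encode_exccode(uint32_t irq, uint32_t mode,
                                    uint32_t level)
{
    return (irq & DEMO_EXC_IRQ_MASK) |
           ((mode & DEMO_EXC_MODE_MASK) << DEMO_EXC_MODE_SHIFT) |
           ((level & DEMO_EXC_LEVEL_MASK) << DEMO_EXC_LEVEL_SHIFT);
}

/* Split an exccode-style word back into its three fields. */
static void demo_decode_exccode(uint32_t exccode, int *mode, int *level,
                                int *irq)
{
    *irq = exccode & DEMO_EXC_IRQ_MASK;
    *mode = (exccode >> DEMO_EXC_MODE_SHIFT) & DEMO_EXC_MODE_MASK;
    *level = (exccode >> DEMO_EXC_LEVEL_SHIFT) & DEMO_EXC_LEVEL_MASK;
}
```

Keeping the three fields in one word lets a single value carry everything the trap entry path needs about the selected interrupt.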

Signed-off-by: LIU Zhiwei 
Signed-off-by: Ian Brockbank 
---
 target/riscv/cpu_helper.c | 129 +++---
 1 file changed, 119 insertions(+), 10 deletions(-)

diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
index 395a1d9140..944afb68d2 100644
--- a/target/riscv/cpu_helper.c
+++ b/target/riscv/cpu_helper.c
@@ -24,6 +24,7 @@
 #include "internals.h"
 #include "pmu.h"
 #include "exec/exec-all.h"
+#include "exec/cpu_ldst.h"
 #include "exec/page-protection.h"
 #include "instmap.h"
 #include "tcg/tcg-op.h"
@@ -33,6 +34,7 @@
 #include "cpu_bits.h"
 #include "debug.h"
 #include "tcg/oversized-guest.h"
+#include "hw/intc/riscv_clic.h"

 int riscv_env_mmu_index(CPURISCVState *env, bool ifetch)
 {
@@ -428,6 +430,20 @@ int riscv_cpu_vsirq_pending(CPURISCVState *env)
 (irqs | irqs_f_vs), env->hviprio);
 }

+static int riscv_cpu_local_irq_mode_enabled(CPURISCVState *env, int mode)
+{
+switch (mode) {
+case PRV_M:
+return env->priv < PRV_M ||
+(env->priv == PRV_M && get_field(env->mstatus, MSTATUS_MIE));
+case PRV_S:
+return env->priv < PRV_S ||
+(env->priv == PRV_S && get_field(env->mstatus, MSTATUS_SIE));
+default:
+return false;
+}
+}
+
 static int riscv_cpu_local_irq_pending(CPURISCVState *env)
 {
 uint64_t irqs, pending, mie, hsie, vsie, irqs_f, irqs_f_vs;
@@ -506,6 +522,18 @@ bool riscv_cpu_exec_interrupt(CPUState *cs, int interrupt_request)
 return true;
 }
 }
+if (interrupt_request & CPU_INTERRUPT_CLIC) {
+RISCVCPU *cpu = RISCV_CPU(cs);
+CPURISCVState *env = &cpu->env;
+int mode = get_field(env->exccode, RISCV_EXCP_CLIC_MODE);
+int enabled = riscv_cpu_local_irq_mode_enabled(env, mode);
+if (enabled) {
+cs->exception_index = RISCV_EXCP_CLIC | env->exccode;
+cs->interrupt_request = cs->interrupt_request & ~CPU_INTERRUPT_CLIC;
+riscv_cpu_do_interrupt(cs);
+return true;
+}
+}
 return false;
 }

@@ -1641,6 +1669,60 @@ static target_ulong riscv_transformed_insn(CPURISCVState *env,
 return xinsn;
 }

+static target_ulong riscv_intr_pc(CPURISCVState *env, target_ulong tvec,
+  target_ulong tvt, bool async,
+  int cause, int mode)
+{
+int mode1 = tvec & XTVEC_MODE;
+int mode2 = tvec & XTVEC_FULL_MODE;
+
+if (!async) {
+return tvec & XTVEC_OBASE;
+}
+/* bits [1:0] encode mode; 0 = direct, 1 = vectored, 2 >= reserved */
+switch (mode1) {
+case XTVEC_CLINT_DIRECT:
+return tvec & XTVEC_OBASE;
+case XTVEC_CLINT_VECTORED:
+return (tvec & XTVEC_OBASE) + cause * 4;
+default:
+if (env->clic && (mode2 == XTVEC_CLIC)) {
+/* Non-vectored, clicintattr[i].shv = 0 || cliccfg.nvbits = 0 */
+if (!riscv_clic_shv_interrupt(env->clic, cause)) {
+/* NBASE = mtvec[XLEN-1:6]<<6 */
+return tvec & XTVEC_NBASE;
+} else {
+/*
+ * pc := M[TBASE + XLEN/8 * exccode] & ~1,
+ * TBASE = mtvt[XLEN-1:6]<<6
+ */
+int size = TARGET_LONG_BITS / 8;
+target_ulong tbase = (tvt & XTVEC_NBASE) + size * cause;
+void *host = tlb_vaddr_to_host(env, tbase, MMU_DATA_LOAD, mode);
+if (host != NULL) {
+target_ulong new_pc = tbase;
+if (!riscv_clic_use_jump_table(env->clic)) {
+/*
+ * Standard CLIC: the vector entry is a function pointer
+ * so look up the destination.
+ */
+new_pc = ldn_p(host, size);
+host = tlb_vaddr_to_host(env, new_pc,
+ MMU_INST_FETCH, mode);
+}
+if (host) {
+return new_pc;
+}
+}
+qemu_log_mask(LOG_GUEST_ERROR,
+  "CLIC: load trap handler error!\n");
+exit(1);
+}
+}
+g_assert_not_reached();
+}
+}
+
 /*
  * Handle Traps
  *
@@ -1654,12 +1736,14 @@ void riscv_cpu_do_interrupt(CPUState *cs)
 bool virt = env->virt_enabled;
 bool write_gva = false;
 uint64_t s;
+int mode, level, irq;

 /*
  * cs->exception is 32-bits wide unlike mcause which is XLEN-bits wide
  * so we mask off the MSB and separate into trap type and cause.
  */
-bool async = !!(

[PATCH 06/11 v2] target/riscv: Update CSR xtvec in CLIC mode

2024-08-19 Thread Ian Brockbank
From: Ian Brockbank 

The new CLIC interrupt-handling mode is encoded as a new state in the
existing WARL xtvec register, where the low two bits are 11.
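
A minimal sketch of the WARL write behaviour described here, in standalone C. The TVEC_* constants and the tvec_write helper are assumptions made for this illustration (the macros the patch uses are defined elsewhere and are not shown in this message); the values follow the layout in the patch comments: mode in bits 1:0, submode in bits 5:2, base in the remaining upper bits.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed local constants, chosen to match the layout described above. */
#define TVEC_MODE_MASK      0x3u   /* bits 1:0 */
#define TVEC_FULL_MODE_MASK 0x3fu  /* bits 5:0, mode plus submode */
#define TVEC_MODE_VECTORED  0x1u
#define TVEC_MODE_CLIC      0x3u   /* mode = 11 with submode bits all zero */

/*
 * Hypothetical helper: return the value xtvec holds after a write of val.
 * Direct and vectored modes are stored unchanged, CLIC mode is accepted
 * only when a CLIC is present, and reserved encodings keep the old value.
 */
static uint64_t tvec_write(uint64_t old, uint64_t val, bool have_clic)
{
    uint64_t mode = val & TVEC_MODE_MASK;
    uint64_t fullmode = val & TVEC_FULL_MODE_MASK;

    if (mode <= TVEC_MODE_VECTORED) {
        return val;
    }
    if (fullmode == TVEC_MODE_CLIC && have_clic) {
        /* Keep the base, force the low six bits to the CLIC encoding. */
        return (val & ~(uint64_t)TVEC_FULL_MODE_MASK) | TVEC_MODE_CLIC;
    }
    return old;
}
```

In this sketch a write with mode 11 but nonzero submode bits is treated as reserved, which is how the comparison against the full six-bit mode field in the patch reads.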

Signed-off-by: LIU Zhiwei 
Signed-off-by: Ian Brockbank 
---
 target/riscv/cpu.h  |  2 ++
 target/riscv/cpu_bits.h |  2 ++
 target/riscv/csr.c  | 63 ++---
 3 files changed, 63 insertions(+), 4 deletions(-)

diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index 12aa8cf6b1..05a014db03 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -283,11 +283,13 @@ struct CPUArchState {
 target_ulong medeleg;

 target_ulong stvec;
+target_ulong stvt; /* clic-spec */
 target_ulong sepc;
 target_ulong scause;
 target_ulong sintthresh; /* clic-spec */

 target_ulong mtvec;
+target_ulong mtvt; /* clic-spec */
 target_ulong mepc;
 target_ulong mcause;
 target_ulong mtval;  /* since: priv-1.10.0 */
diff --git a/target/riscv/cpu_bits.h b/target/riscv/cpu_bits.h
index 0ed44ec0a8..279a6f889b 100644
--- a/target/riscv/cpu_bits.h
+++ b/target/riscv/cpu_bits.h
@@ -153,6 +153,7 @@
 #define CSR_MIE 0x304
 #define CSR_MTVEC   0x305
 #define CSR_MCOUNTEREN  0x306
+#define CSR_MTVT 0x307 /* clic-spec-draft */

 /* 32-bit only */
 #define CSR_MSTATUSH 0x310
@@ -192,6 +193,7 @@
 #define CSR_SIE 0x104
 #define CSR_STVEC   0x105
 #define CSR_SCOUNTEREN  0x106
+#define CSR_STVT 0x107 /* clic-spec-draft */

 /* Supervisor Configuration CSRs */
 #define CSR_SENVCFG 0x10A
diff --git a/target/riscv/csr.c b/target/riscv/csr.c
index 276ef7856e..be0071fd25 100644
--- a/target/riscv/csr.c
+++ b/target/riscv/csr.c
@@ -2170,9 +2170,23 @@ static RISCVException read_mtvec(CPURISCVState *env, int csrno,
 static RISCVException write_mtvec(CPURISCVState *env, int csrno,
   target_ulong val)
 {
-/* bits [1:0] encode mode; 0 = direct, 1 = vectored, 2 >= reserved */
-if ((val & 3) < 2) {
+/*
+ * bits [1:0] encode mode; 0 = direct, 1 = vectored, 3 = CLIC,
+ * others reserved
+ */
+target_ulong mode = get_field(val, XTVEC_MODE);
+target_ulong fullmode = val & XTVEC_FULL_MODE;
+if (mode <= XTVEC_CLINT_VECTORED) {
 env->mtvec = val;
+} else if (XTVEC_CLIC == fullmode && env->clic) {
+/*
+ * CLIC mode hardwires xtvec bits 2-5 to zero.
+ * Layout:
+ *   XLEN-1:6   base (WARL)
+ *   5:2        submode (WARL)  -  for CLIC
+ *   1:0        mode (WARL) - 11 for CLIC
+ */
+env->mtvec = (val & XTVEC_NBASE) | XTVEC_CLIC;
 } else {
 qemu_log_mask(LOG_UNIMP, "CSR_MTVEC: reserved mode not supported\n");
 }
@@ -2271,6 +2285,18 @@ static RISCVException write_mcounteren(CPURISCVState *env, int csrno,
 return RISCV_EXCP_NONE;
 }

+static int read_mtvt(CPURISCVState *env, int csrno, target_ulong *val)
+{
+*val = env->mtvt;
+return RISCV_EXCP_NONE;
+}
+
+static int write_mtvt(CPURISCVState *env, int csrno, target_ulong val)
+{
+env->mtvt = val & XTVEC_NBASE;
+return RISCV_EXCP_NONE;
+}
+
 /* Machine Trap Handling */
 static RISCVException read_mscratch_i128(CPURISCVState *env, int csrno,
  Int128 *val)
@@ -3122,9 +3148,24 @@ static RISCVException read_stvec(CPURISCVState *env, int csrno,
 static RISCVException write_stvec(CPURISCVState *env, int csrno,
   target_ulong val)
 {
-/* bits [1:0] encode mode; 0 = direct, 1 = vectored, 2 >= reserved */
-if ((val & 3) < 2) {
+/*
+ * bits [1:0] encode mode; 0 = direct, 1 = vectored, 3 = CLIC,
+ * others reserved
+ */
+target_ulong mode = val & XTVEC_MODE;
+target_ulong fullmode = val & XTVEC_FULL_MODE;
+if (mode <= XTVEC_CLINT_VECTORED) {
 env->stvec = val;
+} else if (XTVEC_CLIC == fullmode && env->clic) {
+/*
+ * If only CLIC mode is supported, writes to bit 1 are also ignored and
+ * it is always set to one. CLIC mode hardwires xtvec bits 2-5 to zero.
+ * Layout:
+ *   XLEN-1:6   base (WARL)
+ *   5:2        submode (WARL)  -  for CLIC
+ *   1:0        mode (WARL) - 11 for CLIC
+ */
+env->stvec = (val & XTVEC_NBASE) | XTVEC_CLIC;
 } else {
 qemu_log_mask(LOG_UNIMP, "CSR_STVEC: reserved mode not supported\n");
 }
@@ -3149,6 +3190,18 @@ static RISCVException write_scounteren(CPURISCVState *env, int csrno,
 return RISCV_EXCP_NONE;
 }

+static int read_stvt(CPURISCVState *env, int csrno, target_ulong *val)
+{
+*val = env->stvt;
+return RISCV_EXCP_NONE;
+}
+
+static int write_stvt(CPURISCVState *env, int csrno, target_ulong val)
+{
+env->stvt = val & XTVEC_NBASE;
+return RISCV_EXCP_NONE;
+}
+
 /* Supervisor Trap Handling */
 static RISCVException read_sscratch_i128(CPURISCVState *env,

[PATCH 09/11 v2] target/riscv: Update interrupt return in CLIC mode

2024-08-19 Thread Ian Brockbank
From: Ian Brockbank 

When a vectored interrupt is selected and serviced, the hardware will
automatically clear the corresponding pending bit in edge-triggered mode.
This may leave a lower-privilege interrupt pending forever.

Therefore, when an interrupt handler returns, pull the next pending
interrupt for service.
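
The level restore that happens on return can be sketched as follows, under stated assumptions. The bit positions used for mcause.mpil (23:16) and mintstatus.mil (31:24) are taken from my reading of the CLIC draft and are not shown in this message; field_get, field_set and mret_restore_level are hypothetical helpers for this example only.

```c
#include <stdint.h>

/* Assumed field masks (CLIC draft layout, not taken from this patch). */
#define MCAUSE_MPIL    0x00ff0000u /* previous interrupt level, bits 23:16 */
#define MINTSTATUS_MIL 0xff000000u /* active M-mode level, bits 31:24 */

/* Extract a field: divide by the lowest set bit of the mask. */
static uint32_t field_get(uint32_t reg, uint32_t mask)
{
    return (reg & mask) / (mask & ~(mask << 1));
}

/* Insert a field: multiply by the lowest set bit of the mask. */
static uint32_t field_set(uint32_t reg, uint32_t mask, uint32_t val)
{
    return (reg & ~mask) | ((val * (mask & ~(mask << 1))) & mask);
}

/*
 * Hypothetical helper: on mret, copy the saved previous interrupt level
 * from mcause back into the active level field of mintstatus.
 */
static uint32_t mret_restore_level(uint32_t mintstatus, uint32_t mcause)
{
    uint32_t mpil = field_get(mcause, MCAUSE_MPIL);
    return field_set(mintstatus, MINTSTATUS_MIL, mpil);
}
```

After the level is restored, the controller is asked again for the highest-priority pending interrupt, which is the part that keeps a lower-privilege request from being stranded.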

Signed-off-by: LIU Zhiwei 
Signed-off-by: Ian Brockbank 
---
 target/riscv/op_helper.c | 29 +++--
 1 file changed, 27 insertions(+), 2 deletions(-)

diff --git a/target/riscv/op_helper.c b/target/riscv/op_helper.c
index 25a5263573..b6ca3ad598 100644
--- a/target/riscv/op_helper.c
+++ b/target/riscv/op_helper.c
@@ -25,7 +25,11 @@
 #include "exec/cpu_ldst.h"
 #include "exec/helper-proto.h"

-/* Exceptions processing helpers */
+#if !defined(CONFIG_USER_ONLY)
+#include "hw/intc/riscv_clic.h"
+#endif
+
+/* Exception processing helpers */
 G_NORETURN void riscv_raise_exception(CPURISCVState *env,
   uint32_t exception, uintptr_t pc)
 {
@@ -259,6 +263,7 @@ void helper_cbo_inval(CPURISCVState *env, target_ulong address)

 #ifndef CONFIG_USER_ONLY

+/* Return from PRV_S interrupt */
 target_ulong helper_sret(CPURISCVState *env)
 {
 uint64_t mstatus;
@@ -292,8 +297,17 @@ target_ulong helper_sret(CPURISCVState *env)
 }
 env->mstatus = mstatus;

+if (riscv_clic_is_clic_mode(env)) {
+/* Update mintstatus with the PRV_S information */
+target_ulong spil = get_field(env->scause, SCAUSE_SPIL);
+env->mintstatus = set_field(env->mintstatus, MINTSTATUS_SIL, spil);
+env->scause = set_field(env->scause, SCAUSE_SPIE, 1);
+env->scause = set_field(env->scause, SCAUSE_SPP, PRV_U);
+riscv_clic_get_next_interrupt(env->clic);
+}
+
 if (riscv_has_ext(env, RVH) && !env->virt_enabled) {
-/* We support Hypervisor extensions and virtulisation is disabled */
+/* We support Hypervisor extensions and virtualization is disabled */
 target_ulong hstatus = env->hstatus;

 prev_virt = get_field(hstatus, HSTATUS_SPV);
@@ -312,6 +326,7 @@ target_ulong helper_sret(CPURISCVState *env)
 return retpc;
 }

+/* Return from PRV_M interrupt */
 target_ulong helper_mret(CPURISCVState *env)
 {
 if (!(env->priv >= PRV_M)) {
@@ -344,6 +359,16 @@ target_ulong helper_mret(CPURISCVState *env)
 }
 env->mstatus = mstatus;

+if (riscv_clic_is_clic_mode(env)) {
+/* Update mintstatus with the PRV_M information */
+target_ulong mpil = get_field(env->mcause, MCAUSE_MPIL);
+env->mintstatus = set_field(env->mintstatus, MINTSTATUS_MIL, mpil);
+env->mcause = set_field(env->mcause, MCAUSE_MPIE, 1);
+env->mcause = set_field(env->mcause, MCAUSE_MPP,
+riscv_has_ext(env, RVU) ? PRV_U : PRV_M);
+riscv_clic_get_next_interrupt(env->clic);
+}
+
 if (riscv_has_ext(env, RVH) && prev_virt) {
 riscv_cpu_swap_hypervisor_regs(env);
 }
--
2.46.0.windows.1
This message and any attachments may contain privileged and confidential 
information that is intended solely for the person(s) to whom it is addressed. 
If you are not an intended recipient you must not: read; copy; distribute; 
discuss; take any action in or make any reliance upon the contents of this 
message; nor open or read any attachment. If you have received this message in 
error, please notify us as soon as possible on the following telephone number 
and destroy this message including any attachments. Thank you. Cirrus Logic 
International (UK) Ltd and Cirrus Logic International Semiconductor Ltd are 
companies registered in Scotland, with registered numbers SC089839 and SC495735 
respectively. Our registered office is at 7B Nightingale Way, Quartermile, 
Edinburgh, EH3 9EG, UK. Tel: +44 (0)131 272 7000. www.cirrus.com



[PATCH 05/11 v2] target/riscv: Update CSR xip in CLIC mode

2024-08-19 Thread Ian Brockbank
From: Ian Brockbank 

The xip CSR appears hardwired to zero in CLIC mode, replaced by separate
memory-mapped interrupt pending bits (clicintip[i]). Writes to xip will be
ignored and will not trap (i.e., no access faults).
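
A small standalone model of that behaviour, as a sketch: reads return zero and writes are dropped while CLIC mode is active, and the usual masked read-modify-write applies otherwise. The hart_state structure and rmw_xip function are hypothetical names for this example; the sketch also treats the read-back pointer as optional and checks it before storing through it.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced hart state for the illustration. */
struct hart_state {
    bool clic_mode;
    uint64_t xip;
};

/*
 * Hypothetical read-modify-write of xip. ret_val may be NULL when the
 * caller does not need the old value. No error is reported in either
 * mode, matching "ignored and will not trap".
 */
static void rmw_xip(struct hart_state *s, uint64_t *ret_val,
                    uint64_t new_val, uint64_t wr_mask)
{
    if (s->clic_mode) {
        /* Appears hardwired to zero: read 0, drop the write. */
        if (ret_val) {
            *ret_val = 0;
        }
        return;
    }
    if (ret_val) {
        *ret_val = s->xip;
    }
    s->xip = (s->xip & ~wr_mask) | (new_val & wr_mask);
}
```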

Signed-off-by: LIU Zhiwei 
Signed-off-by: Ian Brockbank 
---
 target/riscv/csr.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/target/riscv/csr.c b/target/riscv/csr.c
index a5978e0929..276ef7856e 100644
--- a/target/riscv/csr.c
+++ b/target/riscv/csr.c
@@ -2743,6 +2743,12 @@ static RISCVException rmw_mip(CPURISCVState *env, int csrno,
 uint64_t rval;
 RISCVException ret;

+/* The xip CSR appears hardwired to zero in CLIC mode. */
+if (riscv_clic_is_clic_mode(env)) {
+*ret_val = 0;
+return RISCV_EXCP_NONE;
+}
+
 ret = rmw_mip64(env, csrno, &rval, new_val, wr_mask);
 if (ret_val) {
 *ret_val = rval;
@@ -3294,6 +3300,12 @@ static RISCVException rmw_sip64(CPURISCVState *env, int csrno,
 }
 ret = rmw_vsip64(env, CSR_VSIP, ret_val, new_val, wr_mask);
 } else {
+/* The xip CSR appears hardwired to zero in CLIC mode. */
+if (riscv_clic_is_clic_mode(env)) {
+*ret_val = 0;
+return RISCV_EXCP_NONE;
+}
+
 ret = rmw_mvip64(env, csrno, ret_val, new_val, wr_mask & mask);
 }

--
2.46.0.windows.1



Re: [PATCH v4 4/6] machine/nitro-enclave: Add built-in Nitro Secure Module device

2024-08-19 Thread Dorjoy Chowdhury
On Mon, Aug 19, 2024 at 10:10 PM Daniel P. Berrangé  wrote:
>
> On Mon, Aug 19, 2024 at 10:07:02PM +0600, Dorjoy Chowdhury wrote:
> > On Mon, Aug 19, 2024 at 9:53 PM Daniel P. Berrangé wrote:
> > >
> > > On Mon, Aug 19, 2024 at 09:32:55PM +0600, Dorjoy Chowdhury wrote:
> > > > On Mon, Aug 19, 2024 at 4:13 PM Alexander Graf  wrote:
> > > > >
> > > > > Hey Dorjoy,
> > > > >
> > > > > On 18.08.24 13:42, Dorjoy Chowdhury wrote:
> > > > > > AWS Nitro Enclaves have built-in Nitro Secure Module (NSM) device which
> > > > > > is used for stripped down TPM functionality like attestation. This commit
> > > > > > adds the built-in NSM device in the nitro-enclave machine type.
> > > > > >
> > > > > > In Nitro Enclaves, all the PCRs start in a known zero state and the first
> > > > > > 16 PCRs are locked from boot and reserved. The PCR0, PCR1, PCR2 and PCR8
> > > > > > contain the SHA384 hashes related to the EIF file used to boot the
> > > > > > VM for validation.
> > > > > >
> > > > > > Some optional nitro-enclave machine options have been added:
> > > > > >  - 'id': Enclave identifier, reflected in the module-id of the NSM
> > > > > > device. If not provided, a default id will be set.
> > > > > >  - 'parent-role': Parent instance IAM role ARN, reflected in PCR3
> > > > > > of the NSM device.
> > > > > >  - 'parent-id': Parent instance identifier, reflected in PCR4 of the
> > > > > > NSM device.
> > > > > >
> > > > > > Signed-off-by: Dorjoy Chowdhury 
> > > > > > ---
> > > > > >   crypto/meson.build  |   2 +-
> > > > > >   crypto/x509-utils.c |  73 +++
> > > > >
> > > > >
> > > > > Can you please put this new API into its own patch file?
> > > > >
> > > > >
> > > > > >   hw/core/eif.c   | 225 
> > > > > > +---
> > > > > >   hw/core/eif.h   |   5 +-
> > > > >
> > > > >
> > > > > These changes to eif.c should ideally already be part of the patch that
> > > > > introduces eif.c (patch 1), no? In fact, do you think you can make the
> > > > > whole eif logic its own patch file?
> > > > >
> > > > >
> > > > > >   hw/core/meson.build |   4 +-
> > > > > >   hw/i386/Kconfig |   1 +
> > > > > >   hw/i386/nitro_enclave.c | 141 +++-
> > > > > >   include/crypto/x509-utils.h |  22 
> > > > > >   include/hw/i386/nitro_enclave.h |  26 
> > > > > >   9 files changed, 479 insertions(+), 20 deletions(-)
> > > > > >   create mode 100644 crypto/x509-utils.c
> > > > > >   create mode 100644 include/crypto/x509-utils.h
> > > > > >
> > > > > > diff --git a/crypto/meson.build b/crypto/meson.build
> > > > > > index c46f9c22a7..09633194ed 100644
> > > > > > --- a/crypto/meson.build
> > > > > > +++ b/crypto/meson.build
> > > > > > @@ -62,7 +62,7 @@ endif
> > > > > >   if gcrypt.found()
> > > > > > util_ss.add(gcrypt, files('random-gcrypt.c'))
> > > > > >   elif gnutls.found()
> > > > > > -  util_ss.add(gnutls, files('random-gnutls.c'))
> > > > > > +  util_ss.add(gnutls, files('random-gnutls.c', 'x509-utils.c'))
> > > > >
> > > > >
> > > > > What if we don't have gnutls. Will everything still compile or do we
> > > > > need to add any dependencies?
> > > > >
> > > > >
> > > >
> > > > [...]
> > > >
> > > > > >
> > > > > > diff --git a/hw/core/meson.build b/hw/core/meson.build
> > > > > > index f32d1ad943..8dc4552e35 100644
> > > > > > --- a/hw/core/meson.build
> > > > > > +++ b/hw/core/meson.build
> > > > > > @@ -12,6 +12,8 @@ hwcore_ss.add(files(
> > > > > > 'qdev-clock.c',
> > > > > >   ))
> > > > > >
> > > > > > +libcbor = dependency('libcbor', version: '>=0.7.0')
> > > > > > +
> > > > > >   common_ss.add(files('cpu-common.c'))
> > > > > >   common_ss.add(files('machine-smp.c'))
> > > > > >   system_ss.add(when: 'CONFIG_FITLOADER', if_true: files('loader-fit.c'))
> > > > > > @@ -24,7 +26,7 @@ system_ss.add(when: 'CONFIG_REGISTER', if_true: files('register.c'))
> > > > > >   system_ss.add(when: 'CONFIG_SPLIT_IRQ', if_true: files('split-irq.c'))
> > > > > >   system_ss.add(when: 'CONFIG_XILINX_AXI', if_true: files('stream.c'))
> > > > > >   system_ss.add(when: 'CONFIG_PLATFORM_BUS', if_true: files('sysbus-fdt.c'))
> > > > > > -system_ss.add(when: 'CONFIG_NITRO_ENCLAVE', if_true: [files('eif.c'), zlib])
> > > > > > +system_ss.add(when: 'CONFIG_NITRO_ENCLAVE', if_true: [files('eif.c'), zlib, libcbor, gnutls])
> > > > >
> > > > >
> > > > > Ah, you add the gnutls dependency here. Great! However, this means we
> > > > > now make gnutls (and libcbor) a mandatory dependency for the default
> > > > > configuration. Does configure know about that? I believe before gnutls
> > > > > was optional, right?
> > > > >
> > > >
> > > > I see gnutls is not a required dependency in the root meson.

[PATCH 04/11 v2] target/riscv: Update CSR xie in CLIC mode

2024-08-19 Thread Ian Brockbank
From: Ian Brockbank 

The xie CSR appears hardwired to zero in CLIC mode, replaced by separate
memory-mapped interrupt enables (clicintie[i]). Writes to xie will be
ignored and will not trap (i.e., no access faults).

Signed-off-by: LIU Zhiwei 
Signed-off-by: Ian Brockbank 
---
 target/riscv/csr.c | 34 ++
 1 file changed, 22 insertions(+), 12 deletions(-)

diff --git a/target/riscv/csr.c b/target/riscv/csr.c
index 9c824c0d8f..a5978e0929 100644
--- a/target/riscv/csr.c
+++ b/target/riscv/csr.c
@@ -30,6 +30,10 @@
 #include "qemu/guest-random.h"
 #include "qapi/error.h"

+#if !defined(CONFIG_USER_ONLY)
+#include "hw/intc/riscv_clic.h"
+#endif
+
 /* CSR function table public API */
 void riscv_get_csr_ops(int csrno, riscv_csr_operations *ops)
 {
@@ -1805,16 +1809,19 @@ static RISCVException rmw_mie64(CPURISCVState *env, int csrno,
 uint64_t *ret_val,
 uint64_t new_val, uint64_t wr_mask)
 {
-uint64_t mask = wr_mask & all_ints;
+/* Access to xie will be ignored in CLIC mode and will not trap. */
+if (!riscv_clic_is_clic_mode(env)) {
+uint64_t mask = wr_mask & all_ints;

-if (ret_val) {
-*ret_val = env->mie;
-}
+if (ret_val) {
+*ret_val = env->mie;
+}

-env->mie = (env->mie & ~mask) | (new_val & mask);
+env->mie = (env->mie & ~mask) | (new_val & mask);

-if (!riscv_has_ext(env, RVH)) {
-env->mie &= ~((uint64_t)HS_MODE_INTERRUPTS);
+if (!riscv_has_ext(env, RVH)) {
+env->mie &= ~((uint64_t)HS_MODE_INTERRUPTS);
+}
 }

 return RISCV_EXCP_NONE;
@@ -2906,13 +2913,13 @@ static int read_mintstatus(CPURISCVState *env, int csrno, target_ulong *val)
 static int read_mintthresh(CPURISCVState *env, int csrno, target_ulong *val)
 {
 *val = env->mintthresh;
-return 0;
+return RISCV_EXCP_NONE;
 }

 static int write_mintthresh(CPURISCVState *env, int csrno, target_ulong val)
 {
 env->mintthresh = val;
-return 0;
+return RISCV_EXCP_NONE;
 }

 /* Supervisor Trap Setup */
@@ -3059,7 +3066,10 @@ static RISCVException rmw_sie64(CPURISCVState *env, int csrno,
 *ret_val |= env->sie & nalias_mask;
 }

-env->sie = (env->sie & ~sie_mask) | (new_val & sie_mask);
+/* Writes to xie will be ignored in CLIC mode and will not trap. */
+if (!riscv_clic_is_clic_mode(env)) {
+env->sie = (env->sie & ~sie_mask) | (new_val & sie_mask);
+}
 }

 return ret;
@@ -3337,13 +3347,13 @@ static int read_sintstatus(CPURISCVState *env, int csrno, target_ulong *val)
 static int read_sintthresh(CPURISCVState *env, int csrno, target_ulong *val)
 {
 *val = env->sintthresh;
-return 0;
+return RISCV_EXCP_NONE;
 }

 static int write_sintthresh(CPURISCVState *env, int csrno, target_ulong val)
 {
 env->sintthresh = val;
-return 0;
+return RISCV_EXCP_NONE;
 }

 /* Supervisor Protection and Translation */
--
2.46.0.windows.1



apparent memory leak from object-add+object-del of memory-backend-ram

2024-08-19 Thread Peter Maydell
Hi; I'm looking at a memory leak apparently in the host memory backend
code that you can see from the qmp-cmd-test. Repro instructions:

(1) build QEMU with '--cc=clang' '--cxx=clang++' '--enable-debug'
'--target-list=x86_64-softmmu' '--enable-sanitizers'
(2) run 'make check'. More specifically, to get just this
failure ('make check' on current head-of-tree produces some
other unrelated leak errors) you can run the relevant single test:

(cd build/asan && ASAN_OPTIONS="fast_unwind_on_malloc=0"
QTEST_QEMU_BINARY=./qemu-system-x86_64 ./tests/qtest/qmp-cmd-test
--tap -k -p /x86_64/qmp/object-add-failure-modes)

The test case is doing a variety of object-add then object-del
of the "memory-backend-ram" object, and this add-del cycle seems
to result in a fairly large leak:

Direct leak of 1572864 byte(s) in 6 object(s) allocated from:
#0 0x555c1336efd8 in __interceptor_calloc
(/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/qemu-system-x86_64+0x218efd8)
(BuildId: fc7566a39db1253aed91d500b5b1784e0c438397)
#1 0x7f5bf3472c50 in g_malloc0 debian/build/deb/../../../glib/gmem.c:161:13
#2 0x555c155bb134 in bitmap_new
/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/include/qemu/bitmap.h:102:12
#3 0x555c155ba4ee in dirty_memory_extend system/physmem.c:1831:37
#4 0x555c15585a2b in ram_block_add system/physmem.c:1907:9
#5 0x555c15589e50 in qemu_ram_alloc_internal system/physmem.c:2109:5
#6 0x555c1558a096 in qemu_ram_alloc system/physmem.c:2129:12
#7 0x555c15518b69 in memory_region_init_ram_flags_nomigrate
system/memory.c:1571:21
#8 0x555c1464fd27 in ram_backend_memory_alloc backends/hostmem-ram.c:34:12
#9 0x555c146510ac in host_memory_backend_memory_complete
backends/hostmem.c:345:10
#10 0x555c1580bc90 in user_creatable_complete qom/object_interfaces.c:28:9
#11 0x555c1580c6f8 in user_creatable_add_type qom/object_interfaces.c:125:10
#12 0x555c1580ccc4 in user_creatable_add_qapi qom/object_interfaces.c:157:11
#13 0x555c15ff0e2c in qmp_object_add qom/qom-qmp-cmds.c:227:5
#14 0x555c161ce508 in qmp_marshal_object_add
/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/qapi/qapi-commands-qom.c:337:5
#15 0x555c162a7139 in do_qmp_dispatch_bh qapi/qmp-dispatch.c:128:5
#16 0x555c16387921 in aio_bh_call util/async.c:171:5
#17 0x555c163887fc in aio_bh_poll util/async.c:218:13
#18 0x555c162e1288 in aio_dispatch util/aio-posix.c:423:5
#19 0x555c1638f7be in aio_ctx_dispatch util/async.c:360:5
#20 0x7f5bf3469d3a in g_main_dispatch
debian/build/deb/../../../glib/gmain.c:3419:28
#21 0x7f5bf3469d3a in g_main_context_dispatch
debian/build/deb/../../../glib/gmain.c:4137:7
#22 0x555c163935c9 in glib_pollfds_poll util/main-loop.c:287:9
#23 0x555c16391f03 in os_host_main_loop_wait util/main-loop.c:310:5
#24 0x555c16391acc in main_loop_wait util/main-loop.c:589:11
#25 0x555c14614917 in qemu_main_loop system/runstate.c:801:9
#26 0x555c16008b8c in qemu_default_main system/main.c:37:14
#27 0x555c16008bd7 in main system/main.c:48:12
#28 0x7f5bf12fbd8f in __libc_start_call_main
csu/../sysdeps/nptl/libc_start_call_main.h:58:16

My initial suspicion here is that the problem is that
TYPE_MEMORY_BACKEND has a UserCreatableClass::complete method which
calls HostMemoryBackend::alloc, but there is no corresponding
"now free this" in instance_finalize. So ram_backend_memory_alloc()
calls memory_region_init_ram_flags_nomigrate(), which allocates
RAM, dirty blocks, etc, but nothing ever destroys the MR and the
memory is leaked when the TYPE_MEMORY_BACKEND object is finalized.

But there isn't a "free" method in HostMemoryBackendClass,
only an "alloc", so this looks like an API with "leaks memory"
baked into it. How is the freeing of the memory on object
deletion intended to work?

thanks
-- PMM



Re: [PATCH] MAINTAINERS: Remove myself as reviewer

2024-08-19 Thread Thomas Huth

On 19/08/2024 17.00, Beraldo Leal wrote:

Finally taking this off my to-do list. It’s been a privilege to be part
of this project, but I am no longer actively involved in reviewing
Python code here, so I believe it's best to update the list to reflect
the current maintainers.

Please, feel free to reach out if any questions arise.

Signed-off-by: Beraldo Leal 
---
  MAINTAINERS | 3 ---
  1 file changed, 3 deletions(-)


Thank you very, very much for your past reviews, Beraldo!

I'll queue the patch along with the other testing patches that I'm collecting.

 Thomas




[RFC PATCH] scripts/lsan-suppressions: Add a LeakSanitizer suppressions file

2024-08-19 Thread Peter Maydell
Add a LeakSanitizer suppressions file that documents and suppresses
known false-positive leaks in either QEMU or its dependencies.
To use it you'll need to set
  LSAN_OPTIONS="suppressions=/path/to/scripts/lsan-suppressions.txt"
when running a QEMU built with the leak-sanitizer.

The first and currently only entry is for a deliberate leak in glib's
g_set_user_dirs() that otherwise causes false positive leak reports
in the qga-ssh-test because of its use of G_TEST_OPTION_ISOLATE_DIRS:

Direct leak of 321 byte(s) in 5 object(s) allocated from:
#0 0xdd8abd1e in __interceptor_malloc (/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/qga/qga-ssh-test+0x19cd1e) (BuildId: 7991a166007e8206c51bee401722a8335e7990bb)
#1 0x7fb5bc724738 in g_malloc debian/build/deb/../../../glib/gmem.c:128:13
#2 0x7fb5bc739583 in g_strdup debian/build/deb/../../../glib/gstrfuncs.c:361:17
#3 0x7fb5bc757a29 in set_str_if_different debian/build/deb/../../../glib/gutils.c:1659:21
#4 0x7fb5bc757a29 in set_str_if_different debian/build/deb/../../../glib/gutils.c:1647:1
#5 0x7fb5bc757a29 in g_set_user_dirs debian/build/deb/../../../glib/gutils.c:1743:9
#6 0x7fb5bc743d78 in test_do_isolate_dirs debian/build/deb/../../../glib/gtestutils.c:1486:3
#7 0x7fb5bc743d78 in test_case_run debian/build/deb/../../../glib/gtestutils.c:2917:16
#8 0x7fb5bc743d78 in g_test_run_suite_internal debian/build/deb/../../../glib/gtestutils.c:3018:16
#9 0x7fb5bc74380a in g_test_run_suite_internal debian/build/deb/../../../glib/gtestutils.c:3035:18
#10 0x7fb5bc74380a in g_test_run_suite_internal debian/build/deb/../../../glib/gtestutils.c:3035:18
#11 0x7fb5bc743fe9 in g_test_run_suite debian/build/deb/../../../glib/gtestutils.c:3112:13
#12 0x7fb5bc744055 in g_test_run debian/build/deb/../../../glib/gtestutils.c:2231:7
#13 0x7fb5bc744055 in g_test_run debian/build/deb/../../../glib/gtestutils.c:2218:1
#14 0xdd9293b1 in main qga/commands-posix-ssh.c:439:12
#15 0x7fb5bc3dfd8f in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
#16 0x7fb5bc3dfe3f in __libc_start_main csu/../csu/libc-start.c:392:3
#17 0xdd828ed4 in _start (/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/qga/qga-ssh-test+0x119ed4) (BuildId: 7991a166007e8206c51bee401722a8335e7990bb)

(Strictly speaking, this is a genuine leak, it's just a deliberate
one by glib; they document it in their valgrind-format suppression
file upstream.)

Signed-off-by: Peter Maydell 
---
Does this seem like a good idea?  It gives us a place to document
things like this and to suppress them so we could in theory get a
complete clean 'make check' run with the leak sanitizer on.  It might
be nice if there was an easy way to enable all our "recommended
sanitizer settings" (ASAN_OPTIONS="fast_unwind_on_malloc=0" is
pretty much required to get useful backtraces, for instance), but
I'm not sure there's a neat way to do that.

 scripts/lsan-suppressions.txt | 14 ++
 1 file changed, 14 insertions(+)
 create mode 100644 scripts/lsan-suppressions.txt

diff --git a/scripts/lsan-suppressions.txt b/scripts/lsan-suppressions.txt
new file mode 100644
index 000..5c3cffaa5a0
--- /dev/null
+++ b/scripts/lsan-suppressions.txt
@@ -0,0 +1,14 @@
+# SPDX-License-Identifier: GPL-2.0-or-later
+# Copyright (c) 2024 Linaro Limited
+
+# This is a set of suppressions for LeakSanitizer; you can use it
+# by setting
+#   LSAN_OPTIONS="suppressions=/path/to/scripts/lsan-suppressions.txt"
+# when running a QEMU built with the leak-sanitizer.
+
+# g_set_user_dirs() deliberately leaks the previous cached g_get_user_*
+# values. This is documented in upstream glib's valgrind-format
+# suppression file:
+# https://github.com/GNOME/glib/blob/main/tools/glib.supp
+# This avoids false positive leak reports for the qga-ssh-test.
+leak:g_set_user_dirs
-- 
2.34.1




Re: [RFC PATCH] scripts/lsan-suppressions: Add a LeakSanitizer suppressions file

2024-08-19 Thread Peter Maydell
On Mon, 19 Aug 2024 at 18:07, Peter Maydell  wrote:
>
> Add a LeakSanitizer suppressions file that documents and suppresses
> known false-positive leaks in either QEMU or its dependencies.
> To use it you'll need to set
>   LSAN_OPTIONS="suppressions=/path/to/scripts/lsan-suppressions.txt"
> when running a QEMU built with the leak-sanitizer.
>
> The first and currently only entry is for a deliberate leak in glib's
> g_set_user_dirs() that otherwise causes false positive leak reports
> in the qga-ssh-test because of its use of G_TEST_OPTION_ISOLATE_DIRS:
>
> Direct leak of 321 byte(s) in 5 object(s) allocated from:
> #0 0xdd8abd1e in __interceptor_malloc (/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/qga/qga-ssh-test+0x19cd1e) (BuildId: 7991a166007e8206c51bee401722a8335e7990bb)
> #1 0x7fb5bc724738 in g_malloc debian/build/deb/../../../glib/gmem.c:128:13
> #2 0x7fb5bc739583 in g_strdup debian/build/deb/../../../glib/gstrfuncs.c:361:17
> #3 0x7fb5bc757a29 in set_str_if_different debian/build/deb/../../../glib/gutils.c:1659:21
> #4 0x7fb5bc757a29 in set_str_if_different debian/build/deb/../../../glib/gutils.c:1647:1
> #5 0x7fb5bc757a29 in g_set_user_dirs debian/build/deb/../../../glib/gutils.c:1743:9
> #6 0x7fb5bc743d78 in test_do_isolate_dirs debian/build/deb/../../../glib/gtestutils.c:1486:3
> #7 0x7fb5bc743d78 in test_case_run debian/build/deb/../../../glib/gtestutils.c:2917:16
> #8 0x7fb5bc743d78 in g_test_run_suite_internal debian/build/deb/../../../glib/gtestutils.c:3018:16
> #9 0x7fb5bc74380a in g_test_run_suite_internal debian/build/deb/../../../glib/gtestutils.c:3035:18
> #10 0x7fb5bc74380a in g_test_run_suite_internal debian/build/deb/../../../glib/gtestutils.c:3035:18
> #11 0x7fb5bc743fe9 in g_test_run_suite debian/build/deb/../../../glib/gtestutils.c:3112:13
> #12 0x7fb5bc744055 in g_test_run debian/build/deb/../../../glib/gtestutils.c:2231:7
> #13 0x7fb5bc744055 in g_test_run debian/build/deb/../../../glib/gtestutils.c:2218:1
> #14 0xdd9293b1 in main qga/commands-posix-ssh.c:439:12
> #15 0x7fb5bc3dfd8f in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
> #16 0x7fb5bc3dfe3f in __libc_start_main csu/../csu/libc-start.c:392:3
> #17 0xdd828ed4 in _start (/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/qga/qga-ssh-test+0x119ed4) (BuildId: 7991a166007e8206c51bee401722a8335e7990bb)
>
> (Strictly speaking, this is a genuine leak, it's just a deliberate
> one by glib; they document it in their valgrind-format suppression
> file upstream.)
>
> Signed-off-by: Peter Maydell 
> ---
> Does this seem like a good idea?  It gives us a place to document
> things like this and to suppress them so we could in theory get a
> complete clean 'make check' run with the leak sanitizer on.  It might
> be nice if there was an easy way to enable all our "recommended
> sanitizer settings" (ASAN_OPTIONS="fast_unwind_on_malloc=0" is
> pretty much required to get useful backtraces, for instance), but
> I'm not sure there's a neat way to do that.

On the "no" end of the argument: it looks like from glib 2.79
or thereabouts there was support added to glib to make it
dynamically detect whether it's running in a binary that was
built with LSan and explicitly tell lsan to ignore these
deliberate leaks. That fix is less than a year old, though,
and at least my dev machine is still running 2.72.

https://github.com/GNOME/glib/commit/fb58d55187dfe1565d10c0c0ffdbaa85376cf0b8
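
For readers unfamiliar with the mechanism: LeakSanitizer exposes __lsan_ignore_object() in <sanitizer/lsan_interface.h>, which lets code mark one allocation as a deliberate leak. The sketch below is not glib's implementation. keep_forever() and SKETCH_HAVE_LSAN are hypothetical names, and the guard is a compile-time check for AddressSanitizer builds only, whereas the commit above is described as detecting the sanitizer dynamically.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Detect an AddressSanitizer build at compile time (clang, then GCC). */
#if defined(__has_feature)
#  if __has_feature(address_sanitizer)
#    define SKETCH_HAVE_LSAN 1
#  endif
#endif
#if !defined(SKETCH_HAVE_LSAN) && defined(__SANITIZE_ADDRESS__)
#  define SKETCH_HAVE_LSAN 1
#endif

#ifdef SKETCH_HAVE_LSAN
#  include <sanitizer/lsan_interface.h>
#endif

/* Hypothetical helper: copy a string that is intentionally never freed. */
static char *keep_forever(const char *s)
{
    size_t len = strlen(s) + 1;
    char *copy = malloc(len);

    assert(copy != NULL);
    memcpy(copy, s, len);
#ifdef SKETCH_HAVE_LSAN
    /* Tell the leak checker not to report this allocation. */
    __lsan_ignore_object(copy);
#endif
    return copy;
}
```

Without a sanitizer the helper is just a string copy, so the same source builds either way.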

-- PMM



Re: apparent memory leak from object-add+object-del of memory-backend-ram

2024-08-19 Thread David Hildenbrand

On 19.08.24 18:24, Peter Maydell wrote:

Hi; I'm looking at a memory leak apparently in the host memory backend
code that you can see from the qmp-cmd-test. Repro instructions:


Hi Peter,



(1) build QEMU with '--cc=clang' '--cxx=clang++' '--enable-debug'
'--target-list=x86_64-softmmu' '--enable-sanitizers'
(2) run 'make check'. More specifically, to get just this
failure ('make check' on current head-of-tree produces some
other unrelated leak errors) you can run the relevant single test:

(cd build/asan && ASAN_OPTIONS="fast_unwind_on_malloc=0"
QTEST_QEMU_BINARY=./qemu-system-x86_64 ./tests/qtest/qmp-cmd-test
--tap -k -p /x86_64/qmp/object-add-failure-modes)

The test case is doing a variety of object-add then object-del
of the "memory-backend-ram" object, and this add-del cycle seems
to result in a fairly large leak:

Direct leak of 1572864 byte(s) in 6 object(s) allocated from:
 #0 0x555c1336efd8 in __interceptor_calloc
(/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/qemu-system-x86_64+0x218efd8)
(BuildId: fc7566a39db1253aed91d500b5b1784e0c438397)
 #1 0x7f5bf3472c50 in g_malloc0 debian/build/deb/../../../glib/gmem.c:161:13
 #2 0x555c155bb134 in bitmap_new
/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/include/qemu/bitmap.h:102:12
 #3 0x555c155ba4ee in dirty_memory_extend system/physmem.c:1831:37
 #4 0x555c15585a2b in ram_block_add system/physmem.c:1907:9
 #5 0x555c15589e50 in qemu_ram_alloc_internal system/physmem.c:2109:5
 #6 0x555c1558a096 in qemu_ram_alloc system/physmem.c:2129:12
 #7 0x555c15518b69 in memory_region_init_ram_flags_nomigrate
system/memory.c:1571:21
 #8 0x555c1464fd27 in ram_backend_memory_alloc backends/hostmem-ram.c:34:12
 #9 0x555c146510ac in host_memory_backend_memory_complete
backends/hostmem.c:345:10
 #10 0x555c1580bc90 in user_creatable_complete qom/object_interfaces.c:28:9
 #11 0x555c1580c6f8 in user_creatable_add_type 
qom/object_interfaces.c:125:10
 #12 0x555c1580ccc4 in user_creatable_add_qapi 
qom/object_interfaces.c:157:11
 #13 0x555c15ff0e2c in qmp_object_add qom/qom-qmp-cmds.c:227:5
 #14 0x555c161ce508 in qmp_marshal_object_add
/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/qapi/qapi-commands-qom.c:337:5
 #15 0x555c162a7139 in do_qmp_dispatch_bh qapi/qmp-dispatch.c:128:5
 #16 0x555c16387921 in aio_bh_call util/async.c:171:5
 #17 0x555c163887fc in aio_bh_poll util/async.c:218:13
 #18 0x555c162e1288 in aio_dispatch util/aio-posix.c:423:5
 #19 0x555c1638f7be in aio_ctx_dispatch util/async.c:360:5
 #20 0x7f5bf3469d3a in g_main_dispatch
debian/build/deb/../../../glib/gmain.c:3419:28
 #21 0x7f5bf3469d3a in g_main_context_dispatch
debian/build/deb/../../../glib/gmain.c:4137:7
 #22 0x555c163935c9 in glib_pollfds_poll util/main-loop.c:287:9
 #23 0x555c16391f03 in os_host_main_loop_wait util/main-loop.c:310:5
 #24 0x555c16391acc in main_loop_wait util/main-loop.c:589:11
 #25 0x555c14614917 in qemu_main_loop system/runstate.c:801:9
 #26 0x555c16008b8c in qemu_default_main system/main.c:37:14
 #27 0x555c16008bd7 in main system/main.c:48:12
 #28 0x7f5bf12fbd8f in __libc_start_call_main
csu/../sysdeps/nptl/libc_start_call_main.h:58:16

My initial suspicion here is that the problem is that
TYPE_MEMORY_BACKEND has a UserCreatableClass::complete method which
calls HostMemoryBackend::alloc, but there is no corresponding
"now free this" in instance_finalize. So ram_backend_memory_alloc()
calls memory_region_init_ram_flags_nomigrate(), which allocates
RAM, dirty blocks, etc, but nothing ever destroys the MR and the
memory is leaked when the TYPE_MEMORY_BACKEND object is finalized.

But there isn't a "free" method in HostMemoryBackendClass,
only an "alloc", so this looks like an API with "leaks memory"
baked into it. How is the freeing of the memory on object
deletion intended to work?


I *think* during object_del(), we would be un-refing the contained 
memory-region, which in turn will make the refcount go to 0 and end up 
calling memory_region_finalize().


In memory_region_finalize, we do various things, including calling 
mr->destructor(mr).


For memory_region_init_ram_flags_nomigrate(), the destructor is set 
to memory_region_destructor_ram(). This is the place where we call 
qemu_ram_free(mr->ram_block);


There we clean up.
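
The unref -> finalize -> destructor chain described above can be sketched in isolation. This is a simplified model, not QEMU code: Region, region_new_ram(), region_ref(), region_unref() and ram_frees are hypothetical stand-ins for the memory region, its init function, the QOM reference counting and a counter added purely for illustration.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct Region Region;
struct Region {
    int refcount;
    void (*destructor)(Region *r); /* chosen by the init function */
    void *ram;                     /* stand-in for mr->ram_block */
};

static int ram_frees;              /* counts how often the RAM was released */

/* Stand-in for the RAM destructor: this is where the backing memory goes. */
static void region_destructor_ram(Region *r)
{
    free(r->ram);
    r->ram = NULL;
    ram_frees++;
}

static Region *region_new_ram(size_t size)
{
    Region *r = calloc(1, sizeof(*r));

    assert(r != NULL);
    r->ram = malloc(size);
    assert(r->ram != NULL);
    r->refcount = 1;
    r->destructor = region_destructor_ram;
    return r;
}

static void region_ref(Region *r)
{
    r->refcount++;
}

/* Dropping the last reference finalizes the region and runs its destructor. */
static void region_unref(Region *r)
{
    assert(r->refcount > 0);
    if (--r->refcount == 0) {
        if (r->destructor) {
            r->destructor(r);
        }
        free(r);
    }
}
```

In this model no explicit "free" method is needed on the owner: releasing the owner's reference is what triggers the cleanup.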

What we *don't* clean up is the allocation you are seeing: 
dirty_memory_extend() will extend the ram_list.dirty_memory bitmap as 
needed. It is not stored in the RAMBlock, it's a global list.


It's not really a leak I think: when we object_del + object_add *I 
think* that bitmap will simply get reused.
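
The "grow once, reuse afterwards" behaviour suggested above can be illustrated with a minimal sketch. It assumes a single flat global bitmap; dirty_bits, dirty_pages, grow_count and dirty_extend() are hypothetical names, and the real dirty_memory_extend() in system/physmem.c is more involved than this.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a global dirty bitmap; not QEMU's structure. */
static unsigned char *dirty_bits;  /* lives for the whole process */
static size_t dirty_pages;         /* number of pages currently covered */
static unsigned grow_count;        /* how many times we had to allocate */

/* Make sure the bitmap covers at least new_pages pages; it never shrinks. */
static void dirty_extend(size_t new_pages)
{
    size_t old_bytes = (dirty_pages + 7) / 8;
    size_t new_bytes = (new_pages + 7) / 8;
    unsigned char *p;

    if (new_pages <= dirty_pages) {
        return;                    /* existing bitmap is big enough: reuse it */
    }
    p = realloc(dirty_bits, new_bytes);
    assert(p != NULL);
    /* Newly covered pages start out clean. */
    memset(p + old_bytes, 0, new_bytes - old_bytes);
    dirty_bits = p;
    dirty_pages = new_pages;
    grow_count++;
}
```

Under these assumptions, a repeated add/del cycle of the same size allocates only on the first add; later adds hit the early return.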


I think at some point I had a dirty_memory_shrink() implementation here, 
but I never upstreamed it, because it was not really worth the churn.


--
Cheers,

David / dhildenb




Re: [RFC PATCH] scripts/lsan-suppressions: Add a LeakSanitizer suppressions file

2024-08-19 Thread Alex Bennée
Peter Maydell  writes:

> Add a LeakSanitizer suppressions file that documents and suppresses
> known false-positive leaks in either QEMU or its dependencies.
> To use it you'll need to set
>   LSAN_OPTIONS="suppressions=/path/to/scripts/lsan-suppressions.txt"
> when running a QEMU built with the leak-sanitizer.
>
> The first and currently only entry is for a deliberate leak in glib's
> g_set_user_dirs() that otherwise causes false positive leak reports
> in the qga-ssh-test because of its use of G_TEST_OPTION_ISOLATE_DIRS:

Shame we can't share with scripts/oss-fuzz/lsan_suppressions.txt:

# The tcmalloc on Fedora37 confuses things
leak:/lib64/libtcmalloc_minimal.so.4

# libxkbcommon also leaks in qemu-keymap
leak:/lib64/libxkbcommon.so.0

Or does fuzzing make some things easier to hit?

>
> Direct leak of 321 byte(s) in 5 object(s) allocated from:
> #0 0xdd8abd1e in __interceptor_malloc 
> (/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/qga/qga-ssh-test+0x19cd1e)
>  (BuildId: 7991a166007e8206c51bee401722a8335e7990bb)
> #1 0x7fb5bc724738 in g_malloc debian/build/deb/../../../glib/gmem.c:128:13
> #2 0x7fb5bc739583 in g_strdup 
> debian/build/deb/../../../glib/gstrfuncs.c:361:17
> #3 0x7fb5bc757a29 in set_str_if_different 
> debian/build/deb/../../../glib/gutils.c:1659:21
> #4 0x7fb5bc757a29 in set_str_if_different 
> debian/build/deb/../../../glib/gutils.c:1647:1
> #5 0x7fb5bc757a29 in g_set_user_dirs 
> debian/build/deb/../../../glib/gutils.c:1743:9
> #6 0x7fb5bc743d78 in test_do_isolate_dirs 
> debian/build/deb/../../../glib/gtestutils.c:1486:3
> #7 0x7fb5bc743d78 in test_case_run 
> debian/build/deb/../../../glib/gtestutils.c:2917:16
> #8 0x7fb5bc743d78 in g_test_run_suite_internal 
> debian/build/deb/../../../glib/gtestutils.c:3018:16
> #9 0x7fb5bc74380a in g_test_run_suite_internal 
> debian/build/deb/../../../glib/gtestutils.c:3035:18
> #10 0x7fb5bc74380a in g_test_run_suite_internal 
> debian/build/deb/../../../glib/gtestutils.c:3035:18
> #11 0x7fb5bc743fe9 in g_test_run_suite 
> debian/build/deb/../../../glib/gtestutils.c:3112:13
> #12 0x7fb5bc744055 in g_test_run 
> debian/build/deb/../../../glib/gtestutils.c:2231:7
> #13 0x7fb5bc744055 in g_test_run 
> debian/build/deb/../../../glib/gtestutils.c:2218:1
> #14 0xdd9293b1 in main qga/commands-posix-ssh.c:439:12
> #15 0x7fb5bc3dfd8f in __libc_start_call_main 
> csu/../sysdeps/nptl/libc_start_call_main.h:58:16
> #16 0x7fb5bc3dfe3f in __libc_start_main csu/../csu/libc-start.c:392:3
> #17 0xdd828ed4 in _start 
> (/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/qga/qga-ssh-test+0x119ed4)
>  (BuildId: 7991a166007e8206c51bee401722a8335e7990bb)
>
> (Strictly speaking, this is a genuine leak, it's just a deliberate
> one by glib; they document it in their valgrind-format suppression
> file upstream.)
>
> Signed-off-by: Peter Maydell 
> ---
> Does this seem like a good idea?  It gives us a place to document
> things like this and to suppress them so we could in theory get a
> complete clean 'make check' run with the leak sanitizer on.  It might
> be nice if there was an easy way to enable all our "recommended
> sanitizer settings" (ASAN_OPTIONS="fast_unwind_on_malloc=0" is
> pretty much required to get useful backtraces, for instance), but
> I'm not sure there's a neat way to do that.
>
>  scripts/lsan-suppressions.txt | 14 ++
>  1 file changed, 14 insertions(+)
>  create mode 100644 scripts/lsan-suppressions.txt
>
> diff --git a/scripts/lsan-suppressions.txt b/scripts/lsan-suppressions.txt
> new file mode 100644
> index 000..5c3cffaa5a0
> --- /dev/null
> +++ b/scripts/lsan-suppressions.txt
> @@ -0,0 +1,14 @@
> +# SPDX-License-Identifier: GPL-2.0-or-later
> +# Copyright (c) 2024 Linaro Limited
> +
> +# This is a set of suppressions for LeakSanitizer; you can use it
> +# by setting
> +#   LSAN_OPTIONS="suppressions=/path/to/scripts/lsan-suppressions.txt"
> +# when running a QEMU built with the leak-sanitizer.
> +
> +# g_set_user_dirs() deliberately leaks the previous cached g_get_user_*
> +# values. This is documented in upstream glib's valgrind-format
> +# suppression file:
> +# https://github.com/GNOME/glib/blob/main/tools/glib.supp
> +# This avoids false positive leak reports for the qga-ssh-test.
> +leak:g_set_user_dirs

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro



Re: [PATCH] MAINTAINERS: Remove myself as reviewer

2024-08-19 Thread Philippe Mathieu-Daudé

On 19/8/24 17:00, Beraldo Leal wrote:

Finally taking this off my to-do list. It’s been a privilege to be part
of this project, but I am no longer actively involved in reviewing
Python code here, so I believe it's best to update the list to reflect
the current maintainers.

Please, feel free to reach out if any questions arise.

Signed-off-by: Beraldo Leal 
---
  MAINTAINERS | 3 ---
  1 file changed, 3 deletions(-)


Thanks Beraldo for your previous reviews!

Reviewed-by: Philippe Mathieu-Daudé 




Re: [PULL 3/5] tests/avocado: apply proper skipUnless decorator

2024-08-19 Thread Philippe Mathieu-Daudé

On 16/8/24 09:22, Thomas Huth wrote:

From: Cleber Rosa 

Commit 9b45cc993 added many cases of skipUnless for the sake of
organizing flaky tests.  But, Python decorators *must* follow what
they decorate, so the newlines added should *not* exist there.

Signed-off-by: Cleber Rosa 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Thomas Huth 
Tested-by: Marcin Juszkiewicz 
Message-ID: <20240806173119.582857-3-cr...@redhat.com>
Signed-off-by: Thomas Huth 
---
  tests/avocado/boot_linux_console.py | 1 -
  tests/avocado/intel_iommu.py| 1 -
  tests/avocado/linux_initrd.py   | 1 -
  tests/avocado/machine_aspeed.py | 2 --
  tests/avocado/machine_mips_malta.py | 2 --
  tests/avocado/machine_rx_gdbsim.py  | 1 -
  tests/avocado/reverse_debugging.py  | 4 
  tests/avocado/smmu.py   | 1 -
  8 files changed, 13 deletions(-)




diff --git a/tests/avocado/machine_rx_gdbsim.py 
b/tests/avocado/machine_rx_gdbsim.py
index 412a7a5089..9a0bec8a6e 100644
--- a/tests/avocado/machine_rx_gdbsim.py
+++ b/tests/avocado/machine_rx_gdbsim.py
@@ -49,7 +49,6 @@ def test_uboot(self):
  #exec_command_and_wait_for_pattern(self, 'version', gcc_version)
  
  @skipUnless(os.getenv('QEMU_TEST_FLAKY_TESTS'), 'Test is unstable on GitLab')

-
  def test_linux_sash(self):
  """
  Boots a Linux kernel and checks that the console is operational.


For some weird reason a part of this patch is missing...



[PULL 00/20] Misc fixes for 2024-08-20

2024-08-19 Thread Philippe Mathieu-Daudé
The following changes since commit ecdfa31beb1f7616091bedba79dfdf9ee525ed9d:

  Merge tag 'pull-request-2024-08-16' of https://gitlab.com/thuth/qemu into 
staging (2024-08-16 18:18:27 +1000)

are available in the Git repository at:

  https://github.com/philmd/qemu.git tags/hw-misc-20240820

for you to fetch changes up to 87e012f29f2e47dcd8c385ff8bb8188f9e06d4ea:

  crypto/tlscredspsk: Free username on finalize (2024-08-20 00:49:14 +0200)

Ignored checkpatch warning:

  WARNING: line over 80 characters
  #115: FILE: target/mips/tcg/sysemu/tlb_helper.c:713:
  +MemOp native_op = (((env->CP0_PWSize >> CP0PS_PS) & 1) == 0) ? MO_32 : 
MO_64;


Various fixes

- Null pointer dereference in IPI IOCSR (Jiaxun)
- Correct '-smbios type=4' in man page (Heinrich)
- Use correct MMU index in MIPS get_pte (Phil)
- Reset MPQEMU remote message using device_cold_reset (Peter)
- Update linux-user MIPS CPU list (Phil)
- Do not let exec_command read console if no pattern to wait for (Nick)
- Remove shadowed declaration warning (Pierrick)
- Restrict STQF opcode to SPARC V9 (Richard)
- Add missing Kconfig dependency for POWERNV ISA serial port (Bernhard)
- Do not allow vmport device without i8042 PS/2 controller (Kamil)
- Fix QCryptoTLSCredsPSK leak (Peter)



Bernhard Beschow (1):
  hw/ppc/Kconfig: Add missing SERIAL_ISA dependency to POWERNV machine

Heinrich Schuchardt (1):
  qemu-options.hx: correct formatting -smbios type=4

Jiaxun Yang (2):
  hw/mips/loongson3_virt: Store core_iocsr into LoongsonMachineState
  hw/mips/loongson3_virt: Fix condition of IPI IOCSR connection

Kamil Szczęk (2):
  hw/i386/pc: Unify vmport=auto handling
  hw/i386/pc: Ensure vmport prerequisites are fulfilled

Nicholas Piggin (2):
  tests/avocado: exec_command should not consume console output
  tests/avocado: Mark ppc_hv_tests.py as non-flaky after fixed console
interaction

Peter Maydell (3):
  hw/dma/xilinx_axidma: Use semicolon at end of statement, not comma
  hw/remote/message.c: Don't directly invoke DeviceClass:reset
  crypto/tlscredspsk: Free username on finalize

Philippe Mathieu-Daudé (7):
  target/mips: Pass page table entry size as MemOp to get_pte()
  target/mips: Use correct MMU index in get_pte()
  target/mips: Load PTE as DATA
  linux-user/mips: Do not try to use removed R5900 CPU
  linux-user/mips: Select Octeon68XX CPU for Octeon binaries
  linux-user/mips: Select MIPS64R2-generic for Rel2 binaries
  linux-user/mips: Select Loongson CPU for Loongson binaries

Pierrick Bouvier (1):
  contrib/plugins/execlog: Fix shadowed declaration warning

Richard Henderson (1):
  target/sparc: Restrict STQF to sparcv9

 linux-user/mips/target_elf.h   |  3 --
 linux-user/mips64/target_elf.h | 24 +++--
 target/sparc/insns.decode  |  2 +-
 contrib/plugins/execlog.c  |  4 +-
 crypto/tlscredspsk.c   |  1 +
 hw/dma/xilinx_axidma.c |  2 +-
 hw/i386/pc.c   | 14 +-
 hw/i386/pc_piix.c  |  5 --
 hw/i386/pc_q35.c   |  5 --
 hw/mips/loongson3_virt.c   |  5 +-
 hw/remote/message.c|  5 +-
 target/mips/tcg/sysemu/tlb_helper.c| 69 +-
 target/sparc/translate.c   |  2 +-
 hw/ppc/Kconfig |  1 +
 qemu-options.hx|  6 +--
 tests/avocado/avocado_qemu/__init__.py |  7 +++
 tests/avocado/ppc_hv_tests.py  |  1 -
 17 files changed, 89 insertions(+), 67 deletions(-)

-- 
2.45.2




[PULL 01/20] hw/mips/loongson3_virt: Store core_iocsr into LoongsonMachineState

2024-08-19 Thread Philippe Mathieu-Daudé
From: Jiaxun Yang 

Link: https://lore.kernel.org/qemu-devel/972034d6-23b3-415a-b401-b8bc1cc51...@linaro.org/
Suggested-by: Philippe Mathieu-Daudé 
Signed-off-by: Jiaxun Yang 
Reviewed-by: Philippe Mathieu-Daudé 
Message-ID: <20240621-loongson3-ipi-follow-v2-1-848eafcbb...@flygoat.com>
Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/mips/loongson3_virt.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/hw/mips/loongson3_virt.c b/hw/mips/loongson3_virt.c
index 408e3d7054..27a85e3614 100644
--- a/hw/mips/loongson3_virt.c
+++ b/hw/mips/loongson3_virt.c
@@ -97,6 +97,7 @@ struct LoongsonMachineState {
 MemoryRegion *pio_alias;
 MemoryRegion *mmio_alias;
 MemoryRegion *ecam_alias;
+MemoryRegion *core_iocsr[LOONGSON_MAX_VCPUS];
 };
 typedef struct LoongsonMachineState LoongsonMachineState;
 
@@ -493,6 +494,7 @@ static void mips_loongson3_virt_init(MachineState *machine)
 const char *kernel_filename = machine->kernel_filename;
 const char *initrd_filename = machine->initrd_filename;
 ram_addr_t ram_size = machine->ram_size;
+LoongsonMachineState *s = LOONGSON_MACHINE(machine);
 MemoryRegion *address_space_mem = get_system_memory();
 MemoryRegion *ram = g_new(MemoryRegion, 1);
 MemoryRegion *bios = g_new(MemoryRegion, 1);
@@ -586,6 +588,7 @@ static void mips_loongson3_virt_init(MachineState *machine)
  iocsr, 0, UINT32_MAX);
 memory_region_add_subregion(&MIPS_CPU(cpu)->env.iocsr.mr,
 0, core_iocsr);
+s->core_iocsr[i] = core_iocsr;
 }
 
 if (node > 0) {
-- 
2.45.2




[PULL 03/20] qemu-options.hx: correct formatting -smbios type=4

2024-08-19 Thread Philippe Mathieu-Daudé
From: Heinrich Schuchardt 

processor-family and processor-id can be assigned independently.

Add missing brackets.

Fixes: b5831d79671c ("smbios: add processor-family option")
Signed-off-by: Heinrich Schuchardt 
Reviewed-by: Thomas Huth 
Reviewed-by: Philippe Mathieu-Daudé 
Message-ID: <20240729204816.11905-1-heinrich.schucha...@canonical.com>
Signed-off-by: Philippe Mathieu-Daudé 
---
 qemu-options.hx | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/qemu-options.hx b/qemu-options.hx
index cee0da2014..d99084a5ee 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -2704,7 +2704,7 @@ DEF("smbios", HAS_ARG, QEMU_OPTION_smbios,
 "specify SMBIOS type 3 fields\n"
 "-smbios 
type=4[,sock_pfx=str][,manufacturer=str][,version=str][,serial=str]\n"
 "  [,asset=str][,part=str][,max-speed=%d][,current-speed=%d]\n"
-"  [,processor-family=%d,processor-id=%d]\n"
+"  [,processor-family=%d][,processor-id=%d]\n"
 "specify SMBIOS type 4 fields\n"
 "-smbios 
type=8[,external_reference=str][,internal_reference=str][,connector_type=%d][,port_type=%d]\n"
 "specify SMBIOS type 8 fields\n"
-- 
2.45.2




[PULL 02/20] hw/mips/loongson3_virt: Fix condition of IPI IOCSR connection

2024-08-19 Thread Philippe Mathieu-Daudé
From: Jiaxun Yang 

>>> CID 1547264:  Null pointer dereferences  (REVERSE_INULL)
>>> Null-checking "ipi" suggests that it may be null, but it has already 
>>> been dereferenced on all paths leading to the check.

Resolves: Coverity CID 1547264
Link: https://lore.kernel.org/qemu-devel/752417ad-ab72-4fed-8d1f-af41f15bc...@app.fastmail.com/
Signed-off-by: Jiaxun Yang 
Reviewed-by: Philippe Mathieu-Daudé 
Message-ID: <20240621-loongson3-ipi-follow-v2-2-848eafcbb...@flygoat.com>
Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/mips/loongson3_virt.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/mips/loongson3_virt.c b/hw/mips/loongson3_virt.c
index 27a85e3614..2067b4fecb 100644
--- a/hw/mips/loongson3_virt.c
+++ b/hw/mips/loongson3_virt.c
@@ -574,7 +574,7 @@ static void mips_loongson3_virt_init(MachineState *machine)
 cpu_mips_clock_init(cpu);
 qemu_register_reset(main_cpu_reset, cpu);
 
-if (ipi) {
+if (!kvm_enabled()) {
 hwaddr base = ((hwaddr)node << 44) + virt_memmap[VIRT_IPI].base;
 base += core * 0x100;
 qdev_connect_gpio_out(ipi, i, cpu->env.irq[6]);
-- 
2.45.2
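
As background on the REVERSE_INULL report quoted in the commit message above, here is a minimal standalone sketch of the pattern and of one way to resolve it. It is not the loongson3_virt code: Ipi, connect_checked_late(), connect_guarded() and accel_has_own_ipi are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    int connected;
} Ipi;

/* Flagged pattern: the pointer is used first and null-checked afterwards. */
static int connect_checked_late(Ipi *ipi)
{
    ipi->connected = 0;   /* dereference: ipi must already be non-NULL here */
    if (ipi) {            /* so this check can never be false */
        ipi->connected++;
    }
    return ipi->connected;
}

/*
 * One way to resolve it: branch on the condition that decides whether the
 * device exists at all, instead of testing the pointer after the fact.
 */
static int connect_guarded(Ipi *ipi, bool accel_has_own_ipi)
{
    if (accel_has_own_ipi) {
        return 0;         /* nothing to wire up, ipi is never touched */
    }
    assert(ipi != NULL);
    ipi->connected++;
    return ipi->connected;
}
```

The guarded variant makes the precondition explicit, which is the same idea as testing !kvm_enabled() in the hunk above.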




[PULL 07/20] hw/dma/xilinx_axidma: Use semicolon at end of statement, not comma

2024-08-19 Thread Philippe Mathieu-Daudé
From: Peter Maydell 

In axidma_class_init() we accidentally used a comma at the end of
a statement rather than a semicolon. This has no ill effects, but
it's obviously not intended and it means that Coccinelle scripts
for instance will fail to match on the two statements. Use a
semicolon instead.

Signed-off-by: Peter Maydell 
Reviewed-by: Richard Henderson 
Reviewed-by: Thomas Huth 
Message-ID: <20240813165250.2717650-6-peter.mayd...@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/dma/xilinx_axidma.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/dma/xilinx_axidma.c b/hw/dma/xilinx_axidma.c
index c9cfc3169b..7707634253 100644
--- a/hw/dma/xilinx_axidma.c
+++ b/hw/dma/xilinx_axidma.c
@@ -626,7 +626,7 @@ static void axidma_class_init(ObjectClass *klass, void 
*data)
 {
 DeviceClass *dc = DEVICE_CLASS(klass);
 
-dc->realize = xilinx_axidma_realize,
+dc->realize = xilinx_axidma_realize;
 dc->reset = xilinx_axidma_reset;
 device_class_set_props(dc, axidma_properties);
 }
-- 
2.45.2
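
To see why the comma in the patch above had no ill effects, consider this small sketch. Hooks, init_with_comma() and init_with_semicolon() are hypothetical names. With the comma operator both assignments are still evaluated left to right; they just form a single expression statement instead of two statements.

```c
#include <assert.h>

typedef struct {
    int realize;
    int reset;
} Hooks;

/* Comma form: one expression statement containing two assignments. */
static Hooks init_with_comma(void)
{
    Hooks h = { 0, 0 };

    h.realize = 1,
    h.reset = 2;
    return h;
}

/* Semicolon form: two separate statements, same resulting values. */
static Hooks init_with_semicolon(void)
{
    Hooks h = { 0, 0 };

    h.realize = 1;
    h.reset = 2;
    return h;
}
```

The runtime result is identical; the difference only matters to tools that match on statements, such as the Coccinelle scripts mentioned in the commit message.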




[PULL 08/20] hw/remote/message.c: Don't directly invoke DeviceClass:reset

2024-08-19 Thread Philippe Mathieu-Daudé
From: Peter Maydell 

Directly invoking the DeviceClass::reset method is a bad idea,
because if the device is using three-phase reset then it relies on
transitional reset machinery which is likely to disappear at some
point.

Reset the device in the standard way, by calling device_cold_reset().

Signed-off-by: Peter Maydell 
Reviewed-by: Richard Henderson 
Reviewed-by: Philippe Mathieu-Daudé 
Message-ID: <20240813165250.2717650-7-peter.mayd...@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/remote/message.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/hw/remote/message.c b/hw/remote/message.c
index 50f6bf2d49..38ae6c75b4 100644
--- a/hw/remote/message.c
+++ b/hw/remote/message.c
@@ -215,13 +215,10 @@ fail:
 static void process_device_reset_msg(QIOChannel *ioc, PCIDevice *dev,
  Error **errp)
 {
-DeviceClass *dc = DEVICE_GET_CLASS(dev);
 DeviceState *s = DEVICE(dev);
 MPQemuMsg ret = { 0 };
 
-if (dc->reset) {
-dc->reset(s);
-}
+device_cold_reset(s);
 
 ret.cmd = MPQEMU_CMD_RET;
 
-- 
2.45.2



