On 20.02.25 17:13, Jean-Philippe Brucker wrote:
For Arm CCA we'd like the guest_memfd discard notifier to call the IOMMU
notifiers and create e.g. VFIO mappings. The default VFIO discard
notifier isn't sufficient for CCA because the DMA addresses need a
translation (even without vIOMMU).
At the moment:
* guest_memfd_state_change() calls the populate() notifier
* the populate notifier() calls IOMMU notifiers
* the IOMMU notifier handler calls memory_get_xlat_addr() to get a VA
* it calls ram_discard_manager_is_populated() which fails.
guest_memfd_state_change() only changes the section's state after
calling the populate() notifier. We can't easily invert the order of
operation because it uses the old state bitmap to know which pages need
the populate() notifier.
I assume we talk about this code: [1]
[1] https://lkml.kernel.org/r/20250217081833.21568-1-chenyi.qi...@intel.com
+static int memory_attribute_state_change(MemoryAttributeManager *mgr, uint64_t
offset,
+ uint64_t size, bool shared_to_private)
+{
+ int block_size = memory_attribute_manager_get_block_size(mgr);
+ int ret = 0;
+
+ if (!memory_attribute_is_valid_range(mgr, offset, size)) {
+ error_report("%s, invalid range: offset 0x%lx, size 0x%lx",
+ __func__, offset, size);
+ return -1;
+ }
+
+ if ((shared_to_private && memory_attribute_is_range_discarded(mgr, offset,
size)) ||
+ (!shared_to_private && memory_attribute_is_range_populated(mgr,
offset, size))) {
+ return 0;
+ }
+
+ if (shared_to_private) {
+ memory_attribute_notify_discard(mgr, offset, size);
+ } else {
+ ret = memory_attribute_notify_populate(mgr, offset, size);
+ }
+
+ if (!ret) {
+ unsigned long first_bit = offset / block_size;
+ unsigned long nbits = size / block_size;
+
+ g_assert((first_bit + nbits) <= mgr->bitmap_size);
+
+ if (shared_to_private) {
+ bitmap_clear(mgr->shared_bitmap, first_bit, nbits);
+ } else {
+ bitmap_set(mgr->shared_bitmap, first_bit, nbits);
+ }
+
+ return 0;
+ }
+
+ return ret;
+}
Then, in memory_attribute_notify_populate(), we walk the bitmap again.
Why?
We just checked that it's all in the expected state, no?
virtio-mem doesn't handle it that way, so I'm curious why we would have to do
it here?
For now add a flag to the IOMMU notifier to tell memory_get_xlat_addr()
that we're aware of the RAM discard manager state.
Signed-off-by: Jean-Philippe Brucker <jean-phili...@linaro.org>
---
Definitely not the prettiest hack, any idea how to do this cleanly?
---
include/exec/memory.h | 5 +++++
system/memory.c | 3 ++-
2 files changed, 7 insertions(+), 1 deletion(-)
diff --git a/include/exec/memory.h b/include/exec/memory.h
index 9f73b59867..6fcd98fe58 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -116,6 +116,11 @@ typedef enum {
IOMMU_RO = 1,
IOMMU_WO = 2,
IOMMU_RW = 3,
+ /*
+ * Allow mapping a discarded page, because we're in the process of
+ * populating it.
+ */
+ IOMMU_POPULATING = 4,
} IOMMUAccessFlags;
#define IOMMU_ACCESS_FLAG(r, w) (((r) ? IOMMU_RO : 0) | ((w) ? IOMMU_WO : 0))
diff --git a/system/memory.c b/system/memory.c
index 4c829793a0..8e884f9c15 100644
--- a/system/memory.c
+++ b/system/memory.c
@@ -2221,7 +2221,8 @@ bool memory_get_xlat_addr(IOMMUTLBEntry *iotlb, void
**vaddr,
* Disallow that. vmstate priorities make sure any RamDiscardManager
* were already restored before IOMMUs are restored.
*/
- if (!ram_discard_manager_is_populated(rdm, &tmp)) {
+ if (!(iotlb->perm & IOMMU_POPULATING) &&
+ !ram_discard_manager_is_populated(rdm, &tmp)) {
error_setg(errp, "iommu map to discarded memory (e.g., unplugged"
" via virtio-mem): %" HWADDR_PRIx "",
iotlb->translated_addr);
--
Cheers,
David / dhildenb