Hi Sergej,
On 30/04/17 20:48, Sergej Proskurin wrote:
The function p2m_mem_access_check_and_get_page in mem_access.c
translates a gva to an ipa by means of the hardware functionality
implemented in the function gva_to_ipa. If mem_access is active,
hardware-based gva to ipa translation might fail, as gva_to_ipa uses the
guest's translation tables, access to which might be restricted by the
active VTTBR. To address this issue, we perform the gva to ipa
translation in software.
Signed-off-by: Sergej Proskurin <prosku...@sec.in.tum.de>
---
Cc: Razvan Cojocaru <rcojoc...@bitdefender.com>
Cc: Tamas K Lengyel <ta...@tklengyel.com>
Cc: Stefano Stabellini <sstabell...@kernel.org>
Cc: Julien Grall <julien.gr...@arm.com>
---
xen/arch/arm/mem_access.c | 140 +++++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 139 insertions(+), 1 deletion(-)
diff --git a/xen/arch/arm/mem_access.c b/xen/arch/arm/mem_access.c
index 04b1506b00..352eb6073f 100644
--- a/xen/arch/arm/mem_access.c
+++ b/xen/arch/arm/mem_access.c
@@ -20,6 +20,7 @@
#include <xen/monitor.h>
#include <xen/sched.h>
#include <xen/vm_event.h>
+#include <xen/domain_page.h>
#include <public/vm_event.h>
#include <asm/event.h>
@@ -90,6 +91,129 @@ static int __p2m_get_mem_access(struct domain *d, gfn_t gfn,
return 0;
}
+static int
+p2m_gva_to_ipa(struct p2m_domain *p2m, vaddr_t gva,
+ paddr_t *ipa, unsigned int flags)
I don't think this function belongs in mem_access.c; it would fit
better in p2m.c. Also, the name is a bit confusing: a user would not
know the difference between p2m_gva_to_ipa and gva_to_ipa.
+{
+ int level=0, t0_sz, t1_sz;
Coding style: level = 0.
Also, please use unsigned int for level; t0_sz and t1_sz should stay
signed.
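I.e. something like:

    unsigned int level = 0;
    int t0_sz, t1_sz;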
+ unsigned long t0_max, t1_min;
+ lpae_t pte, *table;
+ mfn_t root_mfn;
+ uint64_t ttbr;
+ uint32_t sctlr = READ_SYSREG(SCTLR_EL1);
+ register_t ttbcr = READ_SYSREG(TCR_EL1);
So you are assuming that this function is always called on the
currently running vCPU, right? If so, this should be clarified in the
commit message and in a comment above the function.
Also, s/ttbcr/tcr/.
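For the vCPU assumption, an illustrative guard (the exact form is up
to you):

    /* The walk reads SCTLR_EL1/TCR_EL1, i.e. the current vCPU's state. */
    ASSERT(p2m->domain == current->domain);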
+ struct domain *d = p2m->domain;
+
+ const unsigned int offsets[4] = {
+#ifdef CONFIG_ARM_64
+ zeroeth_table_offset(gva),
+#endif
+ first_table_offset(gva),
+ second_table_offset(gva),
+ third_table_offset(gva)
+ };
These offsets are based on Xen's translation granule (currently 4KB).
However, guests may use 4KB, 16KB, or 64KB granules; please handle all
of them correctly.
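A rough sketch of deriving the guest's granule from TCR_EL1.TG0 (the
shift/mask values below are written out only for illustration; please
use the proper defines from processor.h, and handle TG1 similarly for
the TTBR1 region):

    /* TCR_EL1.TG0, bits [15:14]: 0 -> 4KB, 1 -> 64KB, 2 -> 16KB. */
    unsigned int granule_shift;

    switch ( (tcr >> 14) & 0x3 )
    {
    case 0: granule_shift = 12; break; /* 4KB  */
    case 1: granule_shift = 16; break; /* 64KB */
    case 2: granule_shift = 14; break; /* 16KB */
    }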
+
+ const paddr_t masks[4] = {
+#ifdef CONFIG_ARM_64
+ ZEROETH_SIZE - 1,
+#endif
+ FIRST_SIZE - 1,
+ SECOND_SIZE - 1,
+ THIRD_SIZE - 1
+ };
+
+ /* If the MMU is disabled, there is no need to translate the gva. */
+ if ( !(sctlr & SCTLR_M) )
+ {
+ *ipa = gva;
+
+ return 0;
+ }
+
+ if ( is_32bit_domain(d) )
+ {
+ /*
+ * XXX: We do not support 32-bit domain translation table walks for
+ * domains using the short-descriptor translation table format, yet.
+ */
Debian ARM 32-bit uses the short-descriptor translation table format.
So are you suggesting that memaccess will not work correctly with a
Debian guest?
+ if ( !(ttbcr & TTBCR_EAE) )
See my comment on patch #2 about the naming convention.
+ return -EFAULT;
+
+#ifdef CONFIG_ARM_64
+ level = 1;
This sounds wrong to me. The first lookup level is determined by the
value of T0SZ and T1SZ (see B3-1352 in ARM DDI 0406C.c).
For the AArch64 version see D4.2.5 in ARM DDI 0487A.k_iss10775.
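A sketch of what I mean for AArch64 with a 4KB granule (assuming Xen's
PAGE_SHIFT/LPAE_SHIFT constants and DIV_ROUND_UP; other granules change
both shift values):

    /*
     * With a 4KB granule, each level resolves LPAE_SHIFT (9) bits on
     * top of the PAGE_SHIFT (12) offset bits, so the first lookup
     * level follows from the input address size (64 - TnSZ).
     */
    static unsigned int first_level(unsigned int tsz)
    {
        unsigned int ia_bits = 64 - tsz;

        return 4 - DIV_ROUND_UP(ia_bits - PAGE_SHIFT, LPAE_SHIFT);
    }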
+#endif
+ }
+
+#ifdef CONFIG_ARM_64
+ if ( is_64bit_domain(d) )
+ {
+ /* Get the max GVA that can be translated by TTBR0. */
+ t0_sz = (ttbcr >> TCR_T0SZ_SHIFT) & TCR_SZ_MASK;
+ t0_max = (1UL << (64 - t0_sz)) - 1;
+
+ /* Get the min GVA that can be translated by TTBR1. */
+ t1_sz = (ttbcr >> TCR_T1SZ_SHIFT) & TCR_SZ_MASK;
+ t1_min = ~0UL - (1UL << (64 - t1_sz)) + 1;
+ }
+ else
+#endif
+ {
+ /* Get the max GVA that can be translated by TTBR0. */
+ t0_sz = (ttbcr >> TCR_T0SZ_SHIFT) & TTBCR_SZ_MASK;
See my comment on patch #2 for the naming convention.
+ t0_max = (1U << (32 - t0_sz)) - 1;
+
+ /* Get the min GVA that can be translated by TTBR1. */
+ t1_sz = (ttbcr >> TCR_T1SZ_SHIFT) & TTBCR_SZ_MASK;
+ t1_min = ~0U - (1U << (32 - t1_sz)) + 1;
1U << (32 - t1_sz) will not fit in an unsigned long when t1_sz is 0; a
shift by the full width of the type is undefined behaviour in C.
Also, this code looks wrong to me. Looking at B3.6.4 in DDI 0406C.c,
you don't handle t0_max and t1_min properly when T0SZ or T1SZ is 0.
I think it would be worth having a look at the pseudo-code for
TranslationTableWalk in the ARM ARM (see B3.19.3 in ARM DDI 0406C.c and
J1-5295 in ARM DDI 0487A.k_iss10775), which gives you all the details
for doing the translation table walk properly.
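An untested sketch of an overflow-safe version for the 32-bit case,
following my reading of B3.6.4:

    if ( t0_sz )
        t0_max = (1UL << (32 - t0_sz)) - 1;
    else
        t0_max = ~0U;  /* T0SZ == 0: TTBR0 can cover the whole 32-bit space. */

    if ( t1_sz )
        t1_min = ~0U - ((1UL << (32 - t1_sz)) - 1);
    else
        t1_min = 0;    /* T1SZ == 0: TTBR1 catches whatever TTBR0 does not. */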
+ }
+
+ if ( t0_max >= gva )
+ /* Use TTBR0 for GVA to IPA translation. */
+ ttbr = READ_SYSREG64(TTBR0_EL1);
+ else if ( t1_min <= gva )
+ /* Use TTBR1 for GVA to IPA translation. */
+ ttbr = READ_SYSREG64(TTBR1_EL1);
+ else
+ /* GVA out of bounds of TTBR(0|1). */
+ return -EFAULT;
+
+ /* Bits [63..48] might be used by an ASID. */
+ root_mfn = p2m_lookup(d, _gfn(paddr_to_pfn(ttbr & ((1ULL<<48)-1))), NULL);
Please don't hardcode the mask.
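E.g. give it a name (illustrative, I don't mind the exact one):

    /* TTBR0/1_EL1: bits [63:48] may hold an ASID, bits [47:0] the base. */
    #define TTBR_BADDR_MASK ((1ULL << 48) - 1)

    root_mfn = p2m_lookup(d, _gfn(paddr_to_pfn(ttbr & TTBR_BADDR_MASK)), NULL);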
+
+ /* Check whether the TTBR holds a valid address. */
+ if ( mfn_eq(root_mfn, INVALID_MFN) )
+ return -EFAULT;
+
+ table = map_domain_page(root_mfn);
Nothing prevents the guest from giving back the page table whilst you
are browsing it. You may want to have a look at the patch "ARM:
introduce vgic_access_guest_memory()" [1], to generalize that function
and avoid re-inventing the wheel.
+
+ for ( ; ; level++ )
+ {
+ pte = table[offsets[level]];
You likely want to use ACCESS_ONCE here to prevent the compiler from
reading the pte twice from memory.
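I.e.:

    pte = ACCESS_ONCE(table[offsets[level]]);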
+
+ if ( level == 3 || !pte.walk.valid || !pte.walk.table )
+ break;
+
+ unmap_domain_page(table);
+
+ root_mfn = p2m_lookup(d, _gfn(pte.walk.base), NULL);
No check on root_mfn?
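Something along the lines of:

    root_mfn = p2m_lookup(d, _gfn(pte.walk.base), NULL);
    if ( mfn_eq(root_mfn, INVALID_MFN) )
        return -EFAULT; /* Next-level table not mapped in the p2m. */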
+ table = map_domain_page(root_mfn);
+ }
+
+ unmap_domain_page(table);
+
+ if ( !pte.walk.valid )
+ return -EFAULT;
+
+ /* Make sure the entry holds the requested access attributes. */
+ if ( ((flags & GV2M_WRITE) == GV2M_WRITE) && pte.pt.ro )
+ return -EFAULT;
IMHO, this function should return the access attributes of the page and
let the caller decide what to do. This would make the function more
generic.
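For instance (the names below are purely illustrative):

    /* Walk the guest's tables and report the leaf permissions found. */
    static int p2m_walk_gpt(struct p2m_domain *p2m, vaddr_t gva,
                            paddr_t *ipa, unsigned int *perms);

The caller in p2m_mem_access_check_and_get_page() would then apply its
own policy, e.g.:

    if ( p2m_walk_gpt(p2m, gva, &ipa, &perms) < 0 )
        goto err;

    if ( (flag & GV2M_WRITE) && !(perms & GV2M_WRITE) )
        goto err;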
+
+ *ipa = pfn_to_paddr(pte.walk.base) | (gva & masks[level]);
+
+ return 0;
+}
+
+
/*
* If mem_access is in use it might have been the reason why get_page_from_gva
* failed to fetch the page, as it uses the MMU for the permission checking.
@@ -109,9 +233,23 @@ p2m_mem_access_check_and_get_page(vaddr_t gva, unsigned long flag,
struct page_info *page = NULL;
struct p2m_domain *p2m = &v->domain->arch.p2m;
+ ASSERT(p2m->mem_access_enabled);
+
Why has this ASSERT been added?
rc = gva_to_ipa(gva, &ipa, flag);
+
+ /*
+ * In case mem_access is active, hardware-based gva_to_ipa translation
+ * might fail. Since gva_to_ipa uses the guest's translation tables, access
+ * to which might be restricted by the active VTTBR, we perform a gva to
+ * ipa translation in software.
+ */
if ( rc < 0 )
- goto err;
+ if ( p2m_gva_to_ipa(p2m, gva, &ipa, flag) < 0 )
+ /*
+ * The software gva to ipa translation can still fail if the
+ * gva is not mapped or does not hold the requested access rights.
+ */
+ goto err;
Rather than falling back, why don't we do the software page table walk every time?
gfn = _gfn(paddr_to_pfn(ipa));
Cheers,
[1] https://patchwork.kernel.org/patch/9676291/
--
Julien Grall
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel