Hi Stefano,
On 27/07/16 02:12, Stefano Stabellini wrote:
On Wed, 20 Jul 2016, Julien Grall wrote:
The current implementation of flush_tlb_domain is relying on the domain
to have a single p2m. With the upcoming feature altp2m, a single domain
may have different p2m. So we would need to switch to the correct p2m in
order to flush the TLBs.
Rather than checking whether the domain is not the current domain, check
whether the VTTBR is different. The resulting assembly code is much
smaller: from 38 instructions (+ 2 functions call) to 22 instructions.
That's true but SYSREG reads are more expensive than regular
instructions.
This argument is not really true. The ARM ARM (D7-1879 in ARM DDI
0487A.j) says: "Reads of the System registers can occur out of order
with respect to earlier instructions executed on the same PE, provided
that any data dependencies between the instructions are respected". So
It will depend on how the micro-architecture implemented access to SYSREG.
However, the current code already contains plenty of SYSREG read access
(via the macro current using TPIDR_EL2). So the number of SYSREG
accesses stay exactly the same.
I also forgot to mention that the number of instructions in the function
call (10 instructions). So we are down from 58 instructions to 22
instructions.
Therefore, smaller code and likely better performance.
Signed-off-by: Julien Grall <julien.gr...@arm.com>
---
xen/arch/arm/p2m.c | 18 +++++++++++-------
1 file changed, 11 insertions(+), 7 deletions(-)
diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index d1b6009..015c1e8 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -151,24 +151,28 @@ void p2m_restore_state(struct vcpu *n)
void flush_tlb_domain(struct domain *d)
{
+ struct p2m_domain *p2m = &d->arch.p2m;
unsigned long flags = 0;
+ uint64_t ovttbr;
/*
- * Update the VTTBR if necessary with the domain d. In this case,
- * it's only necessary to flush TLBs on every CPUs with the current VMID
- * (our domain).
+ * ARM only provides an instruction to flush TLBs for the current
+ * VMID. So switch to the VTTBR of a given P2M if different.
*/
- if ( d != current->domain )
+ ovttbr = READ_SYSREG64(VTTBR_EL2);
+ if ( ovttbr != p2m->vttbr )
{
local_irq_save(flags);
- p2m_load_VTTBR(d);
+ WRITE_SYSREG64(p2m->vttbr, VTTBR_EL2);
+ isb();
}
flush_tlb();
- if ( d != current->domain )
+ if ( ovttbr != READ_SYSREG64(VTTBR_EL2) )
You should be able to remove this second SYSREG read and optimize the
code further.
I should be able, however I think it will not bring much more
optimization here but obfuscating a bit more the code.
Regards,
--
Julien Grall
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel