On 5/13/19 11:15 AM, Andrii Anisov wrote:
Hello Julien,
On 08.05.19 16:59, Julien Grall wrote:
Hi,
On 23/04/2019 09:10, Andrii Anisov wrote:
From: Andrii Anisov <andrii_ani...@epam.com>
Following the discussion [1], a runstate registration interface is
introduced and implemented which uses a guest physical address instead
of a virtual one.
The new hypercall employs the same data structures as its predecessor,
but expects the vcpu_runstate_info structure to not cross a page boundary.
The interface is implemented in such a way that the vcpu_runstate_info
structure is mapped into the hypervisor during hypercall processing and
is accessed directly during its updates. This runstate area mapping
follows the vcpu_info structure registration.
A permanent mapping of the runstate area would consume vmap area on
arm32, which is limited to 1G. However, it is assumed that ARM32 does
not target the server market, and that the remaining plausible
applications will not host enough VCPUs to turn this limitation into a
real issue.
I am afraid I can't possibly back this assumption. As I pointed out in
the previous version, I would be OK with the always map solution on
Arm32 (pending performance) because it would be possible to increase
the virtual address area by reworking the address space.
I'm sorry, I'm not sure what my actions should be regarding that.
There is no code modification involved so far. Just update your cover
letter with what I said above.
The series is tested on ARM64 and build-tested for x86. I'd appreciate
it if someone could check it on x86.
The Linux kernel patch is here [2]. Though it is for 4.14.
The patch looks wrong to me. You are using virt_to_phys() on a percpu
area. What actually promises you that the physical address will always
be the same?
Sorry for my ignorance here, but could you please elaborate on what is
wrong with that?
While the virtual address will never change over the life cycle of a
variable, I am not entirely sure we can make the same assumption for
the physical address.
I know that kmalloc() promises you that the physical address will not
change. But percpu does not seem to use kmalloc(), so have you
confirmed this assumption holds?
Are you saying that the dd command is the CPUBurn? I am not sure how
this could be considered a CPUBurn. IMHO, this is more IO related.
Both /dev/null and /dev/zero are virtual devices; no actual IO is
performed during their operation, so all the load is CPU (user and sys).
Thank you for the explanation. Shall I guess this is an existing
benchmark [1]?
The VCPU(dX)->idle->VCPU(dX) use-case, with the following results:
                      mapped     mapped
                      on access  on init
GLMark2 320x240           2852       2877   +0.8%
+Dom0 CPUBurn             2088       2094   +0.2%
GLMark2 800x600           2368       2375   +0.3%
+Dom0 CPUBurn             1868       1921   +2.8%
GLMark2 1920x1080          931        931      0%
+Dom0 CPUBurn              892        894   +0.2%
Please note that "mapped on access" means using the old runstate
registering interface. The runstate update in this case still often
fails to map the runstate area, as in [5], despite the fact that our
Linux kernel does not have KPTI enabled. So the runstate area update
path, in this case, is really the shortened one.
We know that the old interface is broken, so telling us the new
interface is faster is not entirely useful. What I am more interested
in is how it performs if you use a guest physical address on the
"mapped on access" version.
Hm, I see your point. Well, I can make it for ARM to compare performance.
Also, the IRQ latency difference was checked using TBM in a setup
similar to [5]. Please note that the IRQ rate is one per 30 seconds,
and only the VCPU->idle->VCPU use-case is considered, with the
following results (in ns; the timer granularity is 120 ns):
How long did you run the benchmark?
I ran it until the avg more or less stabilized (2-3 minutes), then took
the minimal avg (note, we have a moving average there).
Did you re-run it multiple times?
mapped on access:
max=9960 warm_max=8640 min=7200 avg=7626
mapped on init:
max=9480 warm_max=8400 min=7080 avg=7341
Unfortunately, there are no consistent results yet from profiling with
Lauterbach PowerTrace. We are still in communication with the tracer
vendor in order to set up the proper configuration.
[1] https://patrickmn.com/projects/cpuburn/? If so, a link to the
benchmark would be useful.
--
Julien Grall
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel