(+ Andre and Stefano)
On 20/07/2020 15:53, Alejandro wrote:
Hello all.
Hello,
I'm new to this community, and firstly I'd like to thank you all for
your efforts on supporting Xen in ARM devices.
Welcome to the community!
I'm trying Xen 4.13.1 on an Allwinner H6 SoC (more precisely a Pine H64
model B, with an ARM Cortex-A53 CPU).
I managed to get a dom0 Linux 5.8-rc5 kernel running fine, unpatched,
and I'm using the upstream device tree for
my board. However, the dom0 kernel has trouble when reading some DT
nodes that are related to the CPUs, and
it can't initialize the thermal subsystem properly, which is a kind of
showstopper for me, because I'm concerned
that letting the CPU run at the maximum frequency without monitoring
its temperature may cause overheating.
I understand this concern. I am aware of some efforts to get CPUFreq
working on Xen, but I am not sure whether anything is available yet. I
have CCed a couple more people who may be able to help here.
The relevant kernel messages are:
[ +0.001959] sun50i-cpufreq-nvmem: probe of sun50i-cpufreq-nvmem failed with error -2
...
[ +0.003053] hw perfevents: failed to parse interrupt-affinity[0] for pmu
[ +0.000043] hw perfevents: /pmu: failed to register PMU devices!
[ +0.000037] armv8-pmu: probe of pmu failed with error -22
I am not sure the PMU failure is related to the thermal failure below.
...
[ +0.000163] OF: /thermal-zones/cpu-thermal/cooling-maps/map0: could not find phandle
[ +0.000063] thermal_sys: failed to build thermal zone cpu-thermal: -22
Would it be possible to paste the device-tree node for
/thermal-zones/cpu-thermal/cooling-maps? I suspect the issue is that
we recreate /cpus from scratch.
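
To illustrate the suspicion above, here is a toy Python model (not Xen or
Linux code). It assumes the cooling map refers to CPU nodes by phandle and
that the regenerated CPU nodes do not carry the host phandles. The node
paths, phandle values and helper names are invented for the example.

```python
# Toy model only: a device tree as {path: properties}. Paths, phandle
# values and the property layout are simplified for illustration.

MAP0 = "/thermal-zones/cpu-thermal/cooling-maps/map0"

host_dt = {
    "/cpus/cpu@0": {"phandle": 1},
    "/cpus/cpu@1": {"phandle": 2},
    MAP0: {"cooling-device": [1, 2]},  # references to the CPU nodes
}


def find_by_phandle(tree, phandle):
    """Return the path of the node carrying `phandle`, or None."""
    for path, props in tree.items():
        if props.get("phandle") == phandle:
            return path
    return None


def unresolved_cooling_devices(tree, map_path):
    """List phandles in `cooling-device` that no node in the tree carries."""
    refs = tree[map_path]["cooling-device"]
    return [ref for ref in refs if find_by_phandle(tree, ref) is None]


def rebuild_cpus(tree, vcpus):
    """Mimic regenerating /cpus: drop the host CPU nodes and emit fresh
    ones that (by assumption) do not keep the host phandles."""
    out = {p: dict(v) for p, v in tree.items() if not p.startswith("/cpus")}
    for i in range(vcpus):
        out[f"/cpus/cpu@{i}"] = {"device_type": "cpu"}
    return out


dom0_dt = rebuild_cpus(host_dt, vcpus=2)

print(unresolved_cooling_devices(host_dt, MAP0))  # []
print(unresolved_cooling_devices(dom0_dt, MAP0))  # [1, 2]
```

In this model the map itself is copied unchanged, so every reference in it
dangles once /cpus has been rebuilt, which would match a "could not find
phandle" error.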
I don't know much about how the thermal subsystem works, but I suspect
this will not be enough to get it working properly on Xen. As a
workaround, you would need to create a dom0 with the same number of
vCPUs as there are pCPUs. They would also need to be pinned.
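
As a rough sketch of that workaround, the small Python helper below builds
the Xen command-line options that would give dom0 one pinned vCPU per pCPU.
The options dom0_max_vcpus and dom0_vcpus_pin are documented Xen
command-line parameters; the helper name and the core count of 4 are
assumptions made for the example, so use the real pCPU count of your board.

```python
def dom0_cpu_options(num_pcpus):
    """Return Xen command-line options for one pinned dom0 vCPU per pCPU."""
    if num_pcpus < 1:
        raise ValueError("need at least one physical CPU")
    # dom0_max_vcpus sets the dom0 vCPU count; dom0_vcpus_pin pins each
    # dom0 vCPU to the pCPU with the same index.
    return f"dom0_max_vcpus={num_pcpus} dom0_vcpus_pin"


# 4 is an assumed core count, used here only for illustration.
print(dom0_cpu_options(4))  # dom0_max_vcpus=4 dom0_vcpus_pin
```

The resulting string would be appended to the Xen (not the Linux) boot
arguments. This only makes the vCPU and pCPU layouts line up; it does not
by itself make the thermal or CPUFreq drivers work.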
I will leave the others to fill in more details.
I've searched for issues, code or commits that may be related to this
issue. The most relevant things I found are:
- A patch that blacklists the A53 PMU:
https://patchwork.kernel.org/patch/10899881/
- The handle_node function in xen/arch/arm/domain_build.c:
https://github.com/xen-project/xen/blob/master/xen/arch/arm/domain_build.c#L1427
I remember this discussion. The problem was that the PMU is using
per-CPU interrupts. Xen is not yet able to handle PPIs as they often
require more context to be saved/restored (in this case the PMU context).
There was a proposal to check whether a device is using PPIs and just remove
them from the Device-Tree. Unfortunately, I haven't seen any official
submission for this patch.
Did you have to apply the patch to boot up? If not, then the error above
shouldn't be a concern. However, if you need PMU support for using the
thermal devices, then it is going to require some work.
I've thought about removing "/cpus" from the skip_matches array in the
handle_node function, but I'm not sure
that would be a good fix.
The node "/cpus" and its sub-nodes are recreated by Xen for Dom0. This is
because Dom0 may have a different number of vCPUs and it doesn't see
the pCPUs.
If you don't skip "/cpus" from the host DT, you would end up with
two "/cpus" paths in your dom0 DT. Most likely, Linux will not be happy
with it.
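
The duplicate-path problem can be sketched with another toy Python model.
This is not the code from domain_build.c: SKIP_PATHS merely stands in for
the path entries of skip_matches, and the node lists and function names are
invented for the example.

```python
# Toy model only: host nodes are copied unless skipped, then the nodes
# generated for dom0 are appended.

SKIP_PATHS = {"/cpus"}


def is_skipped(path, skip_paths):
    """True if `path` is a skipped node or lies underneath one."""
    return any(path == s or path.startswith(s + "/") for s in skip_paths)


def build_dom0_paths(host_paths, generated_paths, skip_paths):
    """Copy the non-skipped host nodes, then append the generated nodes."""
    copied = [p for p in host_paths if not is_skipped(p, skip_paths)]
    return copied + generated_paths


def duplicates(paths):
    """Return the paths that occur more than once, sorted."""
    return sorted({p for p in paths if paths.count(p) > 1})


host = ["/cpus", "/cpus/cpu@0", "/soc", "/thermal-zones"]
generated = ["/cpus", "/cpus/cpu@0"]

# With "/cpus" skipped, only the generated copy ends up in the dom0 tree.
print(duplicates(build_dom0_paths(host, generated, SKIP_PATHS)))  # []

# Without the skip entry, both the host and generated copies are present.
print(duplicates(build_dom0_paths(host, generated, set())))  # ['/cpus', '/cpus/cpu@0']
```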
I vaguely remember some discussions on how to deal with CPUFreq in Xen.
IIRC we agreed that Dom0 should be part of the equation because it
already contains all the drivers. However, I can't remember if we agreed
on how dom0 would be made aware of the pCPUs.
@Volodymyr, I think you were looking at CPUFreq. Maybe you can help?
Best regards,
--
Julien Grall