On 13/08/2020 19:41, Oleksandr wrote:
Rebooting domain 2
root@generic-armv8-xt-dom0:~# (XEN) Xen BUG at ...tAUTOINC+bb71237a55-r0/git/xen/include/xen/mm.h:683
(XEN) ----[ Xen-4.14.0 arm64 debug=y Not tainted ]----
(XEN) CPU: 3
(XEN) PC: 0000000000246f28 ioreq.c#hvm_free_ioreq_mfn+0x68/0x6c
(XEN) LR: 0000000000246ef0
(XEN) SP: 0000800725eafd80
(XEN) CPSR: 60000249 MODE:64-bit EL2h (Hypervisor, handler)
(XEN) X0: 0000000000000001 X1: 403fffffffffffff X2: 000000000000001f
(XEN) X3: 0000000080000000 X4: 0000000000000000 X5: 0000000000400000
(XEN) X6: 0000800725eafe24 X7: 0000ffffd1ef3e08 X8: 0000000000000020
(XEN) X9: 0000000000000000 X10: 00e800008ecebf53 X11: 0400000000000000
(XEN) X12: ffff7e00013b3ac0 X13: 0000000000000002 X14: 0000000000000001
(XEN) X15: 0000000000000001 X16: 0000000000000029 X17: 0000ffff9badb3d0
(XEN) X18: 000000000000010f X19: 0000000810e60e38 X20: 0000800725e68ec0
(XEN) X21: 0000000000000000 X22: 00008004dc0404a0 X23: 000000005a000ea1
(XEN) X24: ffff8000460ec280 X25: 0000000000000124 X26: 000000000000001d
(XEN) X27: ffff000008ad1000 X28: ffff800052e65100 FP: ffff0000223dbd20
(XEN)
(XEN) VTCR_EL2: 80023558
(XEN) VTTBR_EL2: 0002000765f04000
(XEN)
(XEN) SCTLR_EL2: 30cd183d
(XEN) HCR_EL2: 000000008078663f
(XEN) TTBR0_EL2: 00000000781c5000
(XEN)
(XEN) ESR_EL2: f2000001
(XEN) HPFAR_EL2: 0000000000030010
(XEN) FAR_EL2: ffff000008005f00
(XEN)
(XEN) Xen stack trace from sp=0000800725eafd80:
(XEN) 0000800725e68ec0 0000000000247078 00008004dc040000 00000000002477c8
(XEN) ffffffffffffffea 0000000000000001 ffff8000460ec500 0000000000000002
(XEN) 000000000024645c 00000000002462dc 0000800725eafeb0 0000800725eafeb0
(XEN) 0000800725eaff30 0000000060000145 000000000027882c 0000800725eafeb0
(XEN) 0000800725eafeb0 01ff00000935de80 00008004dc040000 0000000000000006
(XEN) ffff800000000000 0000000000000002 000000005a000ea1 000000019bc60002
(XEN) 0000ffffd1ef3e08 0000000000000020 0000000000000004 000000000027c7d8
(XEN) 000000005a000ea1 0000800725eafeb0 000000005a000ea1 0000000000279f98
(XEN) 0000000000000000 ffff8000460ec200 0000800725eaffb8 0000000000262c58
(XEN) 0000000000262c4c 07e0000160000249 0000000000000002 0000000000000001
(XEN) ffff8000460ec500 ffff8000460ec508 ffff8000460ec208 ffff800052e65100
(XEN) 000000005060b478 0000ffffd20f3000 ffff7e00013c77e0 0000000000000000
(XEN) 00e800008ecebf53 0400000000000000 ffff7e00013b3ac0 0000000000000002
(XEN) 0000000000000001 0000000000000001 0000000000000029 0000ffff9badb3d0
(XEN) 000000000000010f ffff8000460ec210 ffff8000460ec200 ffff8000460ec210
(XEN) 0000000000000001 ffff8000460ec500 ffff8000460ec280 0000000000000124
(XEN) 000000000000001d ffff000008ad1000 ffff800052e65100 ffff0000223dbd20
(XEN) ffff000008537004 ffffffffffffffff ffff0000080c17e4 5a000ea160000145
(XEN) 0000000060000000 0000000000000000 0000000000000000 ffff800052e65100
(XEN) ffff0000223dbd20 0000ffff9badb3dc 0000000000000000 0000000000000000
(XEN) Xen call trace:
(XEN) [<0000000000246f28>] ioreq.c#hvm_free_ioreq_mfn+0x68/0x6c (PC)
(XEN) [<0000000000246ef0>] ioreq.c#hvm_free_ioreq_mfn+0x30/0x6c (LR)
(XEN)
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 3:
(XEN) Xen BUG at ...tAUTOINC+bb71237a55-r0/git/xen/include/xen/mm.h:683
(XEN) ****************************************
(XEN)
(XEN) Reboot in five seconds...
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 0:
(XEN) PSCI cpu off failed for CPU0 err=-3
(XEN) ****************************************
(XEN)
(XEN) Reboot in five seconds...
Either I did something wrong (most likely) or there is an issue with
page ref-counting in the IOREQ code. I am still trying to understand
what is going on.
At first glance, the implementation of set_foreign_p2m_entry() looks fine
to me.
Some notes on that:
1. I checked that put_page() was called for these pages in
p2m_put_l3_page() when destroying the domain. This happened before
hvm_free_ioreq_mfn() was executed.
2. There was no BUG detected if I passed "p2m_ram_rw" instead of
"p2m_map_foreign_rw" in guest_physmap_add_entry(), but the DomU couldn't
be fully destroyed because of the reference taken.
This definitely looks like a page reference issue. Would it be possible
to print where the page references are dropped? A WARN() in put_page()
would help.
To avoid a lot of messages, I tend to use a global variable that stores
the page I want to watch.
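
For illustration, here is a minimal standalone sketch of that debugging pattern. It is not Xen code: struct page, watch_page(), put_page_at() and watched_drops are hypothetical stand-ins, and the fprintf() only approximates what WARN() would give you in the hypervisor (a stack trace for the caller). The idea is simply that the drop path warns only when it sees the one page stored in the global, so the log is not flooded.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a page descriptor: just a reference count. */
struct page {
    unsigned long count;
};

/* Debug-only global holding the single page we want to watch. */
static const struct page *watched_page;

/* Number of reference drops seen on the watched page (sketch only). */
static unsigned int watched_drops;

/* Start watching one page; drops on any other page stay silent. */
static void watch_page(const struct page *pg)
{
    watched_page = pg;
    watched_drops = 0;
}

/* Record the call site so the warning says who dropped the reference. */
#define put_page(pg) put_page_at((pg), __func__, __LINE__)

static void put_page_at(struct page *pg, const char *func, int line)
{
    /* Warn before the sanity check so the offending caller is logged. */
    if ( pg == watched_page )
    {
        watched_drops++;
        fprintf(stderr, "WARN: put_page(%p) from %s:%d, count was %lu\n",
                (void *)pg, func, line, pg->count);
    }

    /* Dropping a reference that is not held is a bug. */
    assert(pg->count > 0);
    pg->count--;
}
```

To use it, call watch_page() on the page of interest once it is known (for example when the ioreq page is allocated), then reproduce the reboot and look at which callers show up in the warnings.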
Cheers,
--
Julien Grall