On 07.09.2021 09:58, Juergen Gross wrote:
> On 06.09.21 23:35, Sander Eikelenboom wrote:
>> L.S.,
>>
>> On my AMD box running:
>>      xen-unstable changeset: Fri Sep 3 15:10:43 2021 +0200 git:2d4978ead4
>>      linux kernel: 5.14.1
>>
>> With this setup I'm encountering some issues in dom0, see below.
>>
>> -- 
>> Sander
>>
>> xl dmesg gives:
>>
>> (XEN) [2021-09-06 18:15:04.089] mm.c:3506:d0v0 mfn 63b936 already pinned
>> (XEN) [2021-09-06 18:15:04.089] mm.c:3506:d0v0 mfn 6a0622 already pinned
>> (XEN) [2021-09-06 18:15:04.089] mm.c:3506:d0v0 mfn 6b63da already pinned
>> (XEN) [2021-09-06 18:15:04.089] mm.c:3506:d0v0 mfn 638dd9 already pinned
>> (XEN) [2021-09-06 18:15:04.089] mm.c:3506:d0v0 mfn 68a7bc already pinned
>> (XEN) [2021-09-06 18:15:04.089] mm.c:3506:d0v0 mfn 63c27d already pinned
>> (XEN) [2021-09-06 18:15:04.089] mm.c:3506:d0v0 mfn 6a04f2 already pinned
>> (XEN) [2021-09-06 18:15:04.089] mm.c:3506:d0v0 mfn 690d49 already pinned
>> (XEN) [2021-09-06 18:15:04.089] mm.c:3506:d0v0 mfn 6959a0 already pinned
>> (XEN) [2021-09-06 18:15:04.089] mm.c:3506:d0v0 mfn 6a055e already pinned
>> (XEN) [2021-09-06 18:15:04.090] mm.c:3506:d0v0 mfn 639437 already pinned
>>
>>
>> dmesg gives:
>>
>> [34321.304270] ------------[ cut here ]------------
>> [34321.304277] WARNING: CPU: 0 PID: 23628 at 
>> arch/x86/xen/multicalls.c:102 xen_mc_flush+0x176/0x1a0
>> [34321.304288] Modules linked in:
>> [34321.304291] CPU: 0 PID: 23628 Comm: apt-get Not tainted 
>> 5.14.1-20210906-doflr-mac80211debug+ #1
>> [34321.304294] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640)  , BIOS 
>> V1.8B1 09/13/2010
>> [34321.304296] RIP: e030:xen_mc_flush+0x176/0x1a0
>> [34321.304300] Code: 89 45 18 48 c1 e9 3f 48 89 ce e9 20 ff ff ff e8 60 
>> 03 00 00 66 90 5b 5d 41 5c 41 5d c3 48 c7 45 18 ea ff ff ff be 01 00 00 
>> 00 <0f> 0b 8b 55 00 48 c7 c7 10 97 aa 82 31 db 49 c7 c5 38 97 aa 82 65
>> [34321.304303] RSP: e02b:ffffc90000a97c90 EFLAGS: 00010002
>> [34321.304305] RAX: ffff88807d416398 RBX: ffff88807d416350 RCX: 
>> ffff88807d416398
>> [34321.304306] RDX: 0000000000000001 RSI: 0000000000000001 RDI: 
>> deadbeefdeadf00d
>> [34321.304308] RBP: ffff88807d416300 R08: aaaaaaaaaaaaaaaa R09: 
>> ffff888006160cc0
>> [34321.304309] R10: deadbeefdeadf00d R11: ffffea000026a600 R12: 
>> 0000000000000000
>> [34321.304310] R13: ffff888012f6b000 R14: 0000000012f6b000 R15: 
>> 0000000000000001
>> [34321.304320] FS:  00007f5071177800(0000) GS:ffff88807d400000(0000) 
>> knlGS:0000000000000000
>> [34321.304322] CS:  10000e030 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [34321.304323] CR2: 00007f506f542000 CR3: 00000000160cc000 CR4: 
>> 0000000000000660
>> [34321.304326] Call Trace:
>> [34321.304331]  xen_alloc_pte+0x294/0x320
>> [34321.304334]  move_pgt_entry+0x165/0x4b0
>> [34321.304339]  move_page_tables+0x6fa/0x8d0
>> [34321.304342]  move_vma.isra.44+0x138/0x500
>> [34321.304345]  __x64_sys_mremap+0x296/0x410
>> [34321.304348]  do_syscall_64+0x3a/0x80
>> [34321.304352]  entry_SYSCALL_64_after_hwframe+0x44/0xae
>> [34321.304355] RIP: 0033:0x7f507196301a
> 
> I can see why this failure is occurring, but I'm not sure which way is
> the best to fix it.
> 
> The problem is that a pinned page table is moved: the pmd entry
> referencing it is cleared and a new reference is put into the pmd.
> This is done by getting the old pmd entry, clearing that entry, and then
> using pmd_populate() to write the new pmd entry. pmd_populate() will
> lead to a call of xen_alloc_pte() trying to pin the referenced page
> table, which fails, as it is already pinned.
> 
> The problem has been introduced by commit 0881ace292b662d2 in kernel
> 5.14.
> 
> The following solutions would be possible:
> 
> 1. When running as PV guest skip the optimization of move_pgt_entry()
>     by letting arch_supports_page_table_move() return false. This will
>     result in a performance drop in some cases.
> 
> 2. Unpin the page table before calling pmd_populate(). This adds an
>     unneeded hypercall, and I'm uneasy about doing that without
>     flushing the TLB.

I agree as far as the "unneeded hypercall" aspect goes, but I don't
see any connection to the TLB (or a need to flush it): Pinning has
nothing to do with insertion into a live page table; a pinned page
table can be entirely free floating. It's the removal from a
(possibly) live page table which would require a flush.
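
To make the distinction concrete, here is a small user-space C model of
the failing sequence and of option 2. It is only a sketch: none of the
toy_* names exist in Xen or Linux, the "pinned" flag merely stands in
for the hypervisor's pin state, and no real hypercall or TLB is modelled.
It shows that the table stays pinned while no entry references it, so the
second pin attempt is what fails, and that unpinning first avoids this.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: these types and functions are hypothetical. */
struct toy_pt {
    bool pinned;            /* stands in for the hypervisor's pin state */
};

struct toy_pmd {
    struct toy_pt *pt;      /* NULL means the entry is clear */
};

/* Pinning an already pinned table is rejected ("already pinned"). */
static int toy_pin(struct toy_pt *pt)
{
    assert(pt != NULL);
    if (pt->pinned)
        return -EINVAL;
    pt->pinned = true;
    return 0;
}

static int toy_unpin(struct toy_pt *pt)
{
    assert(pt != NULL);
    if (!pt->pinned)
        return -EINVAL;
    pt->pinned = false;
    return 0;
}

/* Rough stand-in for pmd_populate(): pin first, then write the entry. */
static int toy_populate(struct toy_pmd *pmd, struct toy_pt *pt)
{
    int rc = toy_pin(pt);

    if (rc)
        return rc;
    pmd->pt = pt;
    return 0;
}

/* Move a table from one entry to another, optionally unpinning first. */
static int toy_move(struct toy_pmd *src, struct toy_pmd *dst, bool unpin_first)
{
    struct toy_pt *pt = src->pt;
    int rc;

    src->pt = NULL;         /* table is now free floating, still pinned */
    if (unpin_first) {
        rc = toy_unpin(pt); /* option 2: the extra "hypercall" */
        if (rc)
            return rc;
    }
    return toy_populate(dst, pt);
}
```

In this model, toy_move() without unpin_first fails with -EINVAL for a
table that was pinned via toy_populate(), while the same move with
unpin_first set succeeds and leaves the table pinned again.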

> 3. Add a check in xen_alloc_pte() whether the page table is already
>     pinned and, if so, skip the pinning. This is a rather clean
>     solution, but it will result in other failures if a page table is
>     used multiple times (that case is caught today, as in the failure
>     above).
> 
> My tendency is towards solution 3 as it is local to Xen code and has the
> best performance.

I agree 3 looks most promising. I can't judge how big of a risk
there is for a page table to get used in more than one place, and
hence how important it is to be able to detect that case.
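
As an illustration of that trade-off, here is a user-space C sketch
comparing a strict populate (always pin, roughly today's behaviour) with
a lenient one (skip the pin if already pinned, roughly option 3). All
sketch_* names are made up for this example and are not kernel or Xen
interfaces; the "pinned" flag is an assumed stand-in for the real pin
state.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model types, not kernel structures. */
struct sketch_pt {
    bool pinned;
};

struct sketch_pmd {
    struct sketch_pt *pt;   /* NULL means the entry is clear */
};

static int sketch_pin(struct sketch_pt *pt)
{
    assert(pt != NULL);
    if (pt->pinned)
        return -EINVAL;     /* models the "already pinned" rejection */
    pt->pinned = true;
    return 0;
}

/* Strict variant: always tries to pin, so a second use is detected. */
static int sketch_populate_strict(struct sketch_pmd *pmd, struct sketch_pt *pt)
{
    int rc = sketch_pin(pt);

    if (rc)
        return rc;
    pmd->pt = pt;
    return 0;
}

/* Lenient variant (option 3): skip the pin when already pinned. */
static int sketch_populate_lenient(struct sketch_pmd *pmd, struct sketch_pt *pt)
{
    assert(pt != NULL);
    if (!pt->pinned) {
        int rc = sketch_pin(pt);

        if (rc)
            return rc;
    }
    pmd->pt = pt;
    return 0;
}
```

With the lenient variant, moving a pinned table to a new entry works
without an extra unpin, but populating two entries with the same table
also succeeds silently, whereas the strict variant rejects the second
one.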

Jan

