On 18/11/2022 21:10, Jason Andryuk wrote:
> On Fri, Nov 18, 2022 at 12:22 PM Andrew Cooper
> <andrew.coop...@citrix.com> wrote:
>> On 18/11/2022 14:39, Roger Pau Monne wrote:
>>> Nov 18 01:55:11.753936 (XEN) arch/x86/mm/hap/hap.c:304: d1 failed to allocate from HAP pool
>>> Nov 18 01:55:18.633799 (XEN) Failed to shatter gfn 7ed37: -12
>>> Nov 18 01:55:18.633866 (XEN) d1v0 EPT violation 0x19c (--x/rw-) gpa 0x0000007ed373a1 mfn 0x33ed37 type 0
>>> Nov 18 01:55:18.645790 (XEN) d1v0 Walking EPT tables for GFN 7ed37:
>>> Nov 18 01:55:18.645850 (XEN) d1v0  epte 9c0000047eba3107
>>> Nov 18 01:55:18.645893 (XEN) d1v0  epte 9c000003000003f3
>>> Nov 18 01:55:18.645935 (XEN) d1v0  --- GLA 0x7ed373a1
>>> Nov 18 01:55:18.657783 (XEN) domain_crash called from arch/x86/hvm/vmx/vmx.c:3758
>>> Nov 18 01:55:18.657844 (XEN) Domain 1 (vcpu#0) crashed on cpu#8:
>>> Nov 18 01:55:18.669781 (XEN) ----[ Xen-4.17-rc  x86_64  debug=y  Not tainted ]----
>>> Nov 18 01:55:18.669843 (XEN) CPU:    8
>>> Nov 18 01:55:18.669884 (XEN) RIP:    0020:[<000000007ed373a1>]
>>> Nov 18 01:55:18.681711 (XEN) RFLAGS: 0000000000010002   CONTEXT: hvm guest (d1v0)
>>> Nov 18 01:55:18.681772 (XEN) rax: 000000007ed373a1   rbx: 000000007ed3726c   rcx: 0000000000000000
>>> Nov 18 01:55:18.693713 (XEN) rdx: 000000007ed2e610   rsi: 0000000000008e38   rdi: 000000007ed37448
>>> Nov 18 01:55:18.693775 (XEN) rbp: 0000000001b410a0   rsp: 0000000000320880   r8:  0000000000000000
>>> Nov 18 01:55:18.705725 (XEN) r9:  0000000000000000   r10: 0000000000000000   r11: 0000000000000000
>>> Nov 18 01:55:18.717733 (XEN) r12: 0000000000000000   r13: 0000000000000000   r14: 0000000000000000
>>> Nov 18 01:55:18.717794 (XEN) r15: 0000000000000000   cr0: 0000000000000011   cr4: 0000000000000000
>>> Nov 18 01:55:18.729713 (XEN) cr3: 0000000000400000   cr2: 0000000000000000
>>> Nov 18 01:55:18.729771 (XEN) fsb: 0000000000000000   gsb: 0000000000000000   gss: 0000000000000002
>>> Nov 18 01:55:18.741711 (XEN) ds: 0028   es: 0028   fs: 0000   gs: 0000   ss: 0028   cs: 0020
>>>
>>> It seems to be related to the paging pool; adding Andrew and Henry so
>>> that they are aware.
>> Summary of what I've just given on IRC/Matrix.
>>
>> This crash is caused by two things.  First
>>
>>   (XEN) FLASK: Denying unknown domctl: 86.
>>
>> because I completely forgot to wire up Flask for the new hypercalls.
>> But so did the original XSA-409 fix (as SECCLASS_SHADOW is behind
>> CONFIG_X86), so I don't feel quite as bad about this.
> Broken for ARM, but not for x86, right?

Specifically, the original XSA-409 fix broke Flask (on ARM only) by
introducing shadow domctl to ARM without making flask_shadow_control()
common.

I "fixed" that by removing ARM's use of shadow domctl, and broke it
differently by not adding Flask controls for the new common hypercalls.
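
To make that concrete, here is a paraphrased sketch (from memory, not a
literal excerpt of xen/xsm/flask/hooks.c) of the shape of the problem: the
shadow-op hook only exists under CONFIG_X86, so an ARM build has nothing on
the Flask side to match the shadow domctl.  The shadow_op_to_perm() helper
below is a made-up name standing in for the real op-to-permission switch.

  #ifdef CONFIG_X86
  /* x86-only, so ARM builds get no Flask handling for the shadow domctl. */
  static int flask_shadow_control(struct domain *d, uint32_t op)
  {
      /* shadow_op_to_perm() is illustrative, not the real code: it maps
       * XEN_DOMCTL_SHADOW_OP_* onto a SECCLASS_SHADOW permission. */
      uint32_t perm = shadow_op_to_perm(op);

      return current_has_perm(d, SECCLASS_SHADOW, perm);
  }
  #endif /* CONFIG_X86 */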

> I think SECCLASS_SHADOW is available in the policy bits - it's just
> whether or not the hook functions are available?

I suspect so.

>> And second because libxl ignores the error it gets back, and blindly
>> continues onward.  Anthony has posted "libs/light: Propagate
>> libxl__arch_domain_create() return code" to fix the libxl half of the
>> bug, and I posted a second libxl bugfix to fix an error message.  Both
>> are very simple.
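
Purely for illustration (this is not Anthony's patch, and the argument list
is from memory), the libxl half of the fix amounts to no longer discarding
the return value:

  /* Sketch only, not the posted "libs/light: Propagate
   * libxl__arch_domain_create() return code" patch; argument names are
   * illustrative. */
  rc = libxl__arch_domain_create(gc, d_config, state, domid);
  if (rc)
      goto out;   /* previously the error was dropped and domain build
                     carried on, leading to the crash above */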
>>
>>
>> For Flask, we need new access vectors because this is a common
>> hypercall, but I'm unsure how to interlink it with x86's shadow
>> control.  This will require a bit of pondering, but it is probably
>> easier to just leave them unlinked.
> It sort of seems like it could go under domain2 since domain/domain2
> have most of the memory stuff, but it is non-PV.  shadow has its own
> set of hooks.  It could go in hvm which already has some memory stuff.

Having looked at all the proposed options, I'm going to put it in
domain/domain2.

This new hypercall is intentionally common, and applicable to all domain
types (eventually - x86 PV guests use this memory pool during migrate). 
Furthermore, it needs backporting along with all the other fixes to try
and make 409 work.
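
As a rough sketch of that wiring (not a final patch - the DOMAIN2__* vector
names are placeholders until matching entries are added under "class domain2"
in the policy's access_vectors), the new
XEN_DOMCTL_{get,set}_paging_mempool_size ops would gain cases in
flask_domctl() along these lines:

  /* Placeholder vector names; the real ones depend on the access_vectors
   * additions.  These cases sit in flask_domctl()'s switch. */
  case XEN_DOMCTL_get_paging_mempool_size:
      return current_has_perm(d, SECCLASS_DOMAIN2,
                              DOMAIN2__GET_PAGING_MEMPOOL_SIZE);

  case XEN_DOMCTL_set_paging_mempool_size:
      return current_has_perm(d, SECCLASS_DOMAIN2,
                              DOMAIN2__SET_PAGING_MEMPOOL_SIZE);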

~Andrew
