Hi Andrew, Tamas,
On Wednesday, May 29, 2019 2:49 AM, Andrew Cooper <andrew.coop...@citrix.com>
wrote:
>> After a series of tests on 1 or 4 VCPUs, my domain ends up in 2 possible
>> states:
>> - frozen: the mouse doesn't move, so I would guess the VCPUs are blocked.
>> I'm calling the xc_(un)pause_domain APIs multiple times when I write to the
>> shadow copies, but it's always synchronous, so I doubt that they interfered
>> and "paused" the domain.
>
> xc_{,un}pause_domain() calls are reference counted. Calling unpause too many
> times should be safe (from a refcount point of view), and should fail the
> hypercall with -EINVAL.
>
>> Also, the log output I have before I detect that Ansible failed to execute
>> shows that the resume succeeded and Xen is ready to process VMI events.
>>
>> - BSOD: that's the second possibility; apparently I'm corrupting a critical
>> data structure in the operating system, and the Windbg analysis is
>> inconclusive, so I can't tell much.
>>
>> Either way, I can't execute this test sequentially 10 000 times without a
>> crash.
>
> Ok good - this is a far easier place to start debugging from.
>
>> -> Could you look at the implementation, and tell me if I misused the APIs
>> somewhere ?
>> https://gist.github.com/mtarral/d99ce5524cfcfb5290eaa05702c3e8e7
>
> Some observations.
>
> 1) In xen_pause_vm(), you do an xc_domain_getinfo(). First of all, the API
> is crazy, so you also need to check "|| info.domid != domid" in your error
> condition, but that shouldn't be an issue here as the domid isn't changing.
"The API is crazy"
Could you elaborate on that?
> Furthermore, the results of xc_domain_getinfo() are stale before the
> hypercall returns, so it is far less useful than it appears.
I looked at libvmi's implementation:
https://github.com/libvmi/libvmi/blob/master/libvmi/driver/xen/xen.c#L2696
I guess I should have implemented the checks too.
They just didn't make sense to me, as I was sure that my calls were
synchronous, one after the other.
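So the check would look roughly like this (my reading of what Andrew
describes and what libvmi does; xch and domid already in scope):

    xc_dominfo_t info;
    /* xc_domain_getinfo() returns the number of domains it found starting
     * at the requested domid, hence both checks */
    if ( xc_domain_getinfo(xch, domid, 1, &info) != 1 || info.domid != domid )
    {
        fprintf(stderr, "domain %u not found\n", domid);
        return -1;
    }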
> I'm afraid that the only safe way to issue pause/unpauses is to know that
> you've reference counted your own correctly. All entities in dom0 with
> privilege can fight over each others references, because there is nothing Xen
> can use to distinguish the requests.
-> I removed my calls to xc_pause/resume.
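For reference, if I re-add them later, I'll keep each pause matched with
exactly one unpause and check the return values, roughly:

    if ( xc_domain_pause(xch, domid) )
        return -1;
    /* ... write to the shadow copies ... */
    if ( xc_domain_unpause(xch, domid) )
        return -1;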
> 2) You allocate a libxl context but do nothing with it. That can all go,
> along with the linkage against libxl. Also, you don't need to create a
> logger like that. Despite being utterly unacceptable behaviour for a
> library, it is the default by passing NULL in xc_interface_open().
I followed drakvuf's xen init function:
https://github.com/tklengyel/drakvuf/blob/master/src/xen_helper/xen_helper.c#L140
I thought I was going to need it at some point; same for the xl_logger
initialization.
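So the init can shrink to something like this (sketch):

    /* no libxl needed; passing NULL here makes libxc use its default
     * stderr logger */
    xc_interface *xch = xc_interface_open(NULL, NULL, 0);
    if ( !xch )
        return -1;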
> 3) A malloc()/memset() pair is more commonly spelt calloc()
True.
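i.e. something like (generic buf/size, not the exact names from the gist):

    /* before */
    buf = malloc(size);
    memset(buf, 0, size);

    /* after */
    buf = calloc(1, size);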
> And some questions.
>
> 1) I'm guessing the drakvuf_inject_trap(drakvuf, 0x293e6a0, 0) call is
> specific to the exact windows kernel in use?
Yes, I used a libvmi Python script to translate the symbol -> virtual address
-> physical address.
Then I replaced that value in my code and recompiled the binary before the test.
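Roughly, this is what the script does, expressed with libvmi's C API (the
symbol name below is just a placeholder, and vmi is an already-initialized
vmi_instance_t):

    addr_t va = 0, pa = 0;
    /* kernel symbol -> guest virtual address -> guest physical address */
    if ( VMI_SUCCESS == vmi_translate_ksym2v(vmi, "SomeKernelSymbol", &va) &&
         VMI_SUCCESS == vmi_translate_kv2p(vmi, va, &pa) )
        printf("trap target: va=0x%" PRIx64 " pa=0x%" PRIx64 "\n", va, pa);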
> 2) In vmi_init(), what is the purpose of fmask and zero_page_gfn? You add
> one extra gfn to the guest, called zero_page, and fill it with 1's from fmask.
>
> 3) You create two altp2m's, but both have the same default access. Is this
> deliberate, or a bug? If deliberate, why?
>
> Finally, and probably the source of the memory corruption...
>
> 4) When injecting a trap, you allocate a new gfn, memcpy() the contents and
> insert a 0xcc (so far so good). You then remap the executable view to point
> at the new gfn with a breakpoint in (fine), and remap the readable view to
> point at the zero_page, which is full of 1's (uh-oh).
>
> What is this final step trying to achieve? It guarantees that patch-guard
> will eventually notice and BSOD your VM for critical structure corruption.
> The read-only view needs to point to the original gfn with only read
> permissions, so when Windows reads the gfn back, it sees what it expects.
> You also need to prohibit writes to either gfn so you can spot writes
> (unlikely in this case but important for general introspection) so you can
> propagate the change to both copies.
Yes, I missed that PatchGuard would eventually check those shadowed pages anyway.
I was so happy to see my breakpoints working that I proceeded to the tests,
hoping for a quick reproduction of the bug.
I have now implemented a basic mem_access event handler: the original GFN being
remapped is restricted to execute-only (--X), and on an access event I switch to
the hostp2m and single-step to get past PatchGuard (rough sketch below).
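Rough sketch of what I have now (libxc names; xch/domid/orig_gfn/max_gpfn
assumed in scope, exec_view created earlier with xc_altp2m_create_view(),
error handling dropped):

    xen_pfn_t shadow_gfn = max_gpfn + 1;  /* first free gfn past the guest's max */
    xc_domain_populate_physmap_exact(xch, domid, 1, 0, 0, &shadow_gfn);
    /* copy the original page into shadow_gfn and write 0xCC at the trap
     * offset (done with xc_map_foreign_range(), not shown) */
    /* in the exec view, orig_gfn now shows the breakpointed copy */
    xc_altp2m_change_gfn(xch, domid, exec_view, orig_gfn, shadow_gfn);
    /* restrict it to execute-only so any read/write raises a mem_access
     * event; the handler switches to the hostp2m and single-steps */
    xc_altp2m_set_mem_access(xch, domid, exec_view, orig_gfn, XENMEM_access_x);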
It works, but Xen eventually fails: after ~90 tests, it cannot populate the
ring pfn anymore:
INFO:root:==== test 92 ====
INFO:root:starting drakvuf
INFO:root:starting Ansible
INIT
xen_init_interface
xc_interface_open
create logger
allocating libxc context
init ring page
xc: error: Failed to populate ring pfn (16 = Device or resource busy): Internal error
fail to enable monitoring: Device or resource busy
fail to init xen interface
CLOSE
Fail to init vmi
(I updated the Gist:
https://gist.github.com/mtarral/d99ce5524cfcfb5290eaa05702c3e8e7#file-xen-drakvuf-c)
What do you think happened?
I have a call to xc_domain_setmaxmem with ~0, so this shouldn't happen?
https://gist.github.com/mtarral/d99ce5524cfcfb5290eaa05702c3e8e7#file-xen-drakvuf-c-L598
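For context, my init does roughly this around the failure point (sketch,
libxc names, error handling dropped):

    /* raise maxmem so extra pages (ring, shadow copies) can be populated */
    xc_domain_setmaxmem(xch, domid, ~0);
    /* map the monitor ring page and get the event channel port */
    uint32_t remote_port = 0;
    void *ring_page = xc_monitor_enable(xch, domid, &remote_port);
    if ( !ring_page )
        return -1;   /* this is the call that fails with EBUSY after ~90 runs */
    /* ... and on teardown the ring must be released again ... */
    xc_monitor_disable(xch, domid);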
>> I used the compat APIs, like Drakvuf does.
>>
>> @Tamas, if you could check the traps implementation.
>>
>> You also have stress-test.py, which is the small test suite that I used, and
>> the screenshot showing the stdout preceding a test failure,
>> when Ansible couldn't contact the WinRM service because the domain was frozen.
>>
>> Note: I stole some code from libvmi, to handle page read/write in Xen.
>>
>> PS: in the case where the domain is frozen, and I destroy the domain, a
>> (null) entry will remain
>> in xl list, even though my stress-test.py process is already dead.
>>
>> I have 4 of these entries in my xl list right now.
>
> That's almost certainly a reference not being dropped on a page. Can you run
> `xl debug-keys q` and paste the resulting analysis which will be visible in
> `xl dmesg`?
>
> It is probably some missing cleanup in the altp2m code.
I just checked, and I had a few "xen-drakvuf" processes still running in the
background.
When the main Python process raised an exception and terminated, they were
reparented to systemd.
Killing them removed the (null) domain entries.
Thanks,
Mathieu
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel