On 11.09.2019 12:57, Jan Beulich wrote:
> On 09.09.2019 17:35, Alexandru Stefan ISAILA wrote:
>> A/D bit writes (on page walks) can be considered benign by an introspection
>> agent, so receiving vm_events for them is a pessimization. We try here to
>> optimize by filtering these events out.
>> Currently, we are fully emulating the instruction at RIP when the hardware
>> sees an EPT fault with npfec.kind != npfec_kind_with_gla. This is, however,
>> incorrect, because the instruction at RIP might legitimately cause an
>> EPT fault of its own while accessing a _different_ page from the original
>> one, where A/D were set.
>> The solution is to perform the whole emulation, while ignoring EPT
>> restrictions for the walk part, and taking them into account for the
>> "actual" emulating of the instruction at RIP. When we send out a vm_event,
>> we don't want the emulation to complete, since in that case we won't be able
>> to veto whatever it is doing.
>> That would mean that we can't actually prevent any malicious activity,
>> instead we'd only be able to report on it.
>> When we see a "send-vm_event" case while emulating, we need to first send the
>> event out and then suspend the emulation (return X86EMUL_RETRY).
>> After the emulation stops we'll call hvm_vm_event_do_resume() again after the
>> introspection agent treats the event and resumes the guest. There, the
>> instruction at RIP will be fully emulated (with the EPT ignored) if the
>> introspection application allows it, and the guest will continue to run past
>> the instruction.
>>
>> A common example is if the hardware exits because of an EPT fault caused by a
>> page walk, p2m_mem_access_check() decides if it is going to send a vm_event.
>> If the vm_event was sent and it would be treated so it runs the instruction
>> at RIP, that instruction might also hit a protected page and provoke a
>> vm_event.
>>
>> Now if npfec.kind == npfec_kind_in_gpt and
>> d->arch.monitor.inguest_pagefault_disabled is true then we are in the page
>> walk case and we can do this emulation optimization and emulate the page walk
>> while ignoring the EPT, but don't ignore the EPT for the emulation of the
>> actual instruction.
>>
>> In the first case we would have 2 EPT events, in the second case we would
>> have 1 EPT event if the instruction at the RIP triggers an EPT event.
>>
>> We use hvmemul_map_linear_addr() to intercept r/w access and
>> __hvm_copy() to intercept exec access.
> 
> Just like said for v8 - this doesn't look to match the implementation.
> 
>> hvm_emulate_send_vm_event() can return false if there was no violation,
>> if there was an error from monitor_traps() or p2m_get_mem_access().
>> Returning false if p2m_get_mem_access() fails is needed because the EPT
>> entry will have rwx memory access rights.
> 
> I have to admit I still don't understand this reasoning, but I
> guess I should leave it to the VM event maintainers to judge.
> In particular it's unclear to me why p2m_get_mem_access()
> failure would imply rwx access.
> 
>> --- a/xen/arch/x86/hvm/emulate.c
>> +++ b/xen/arch/x86/hvm/emulate.c
>> @@ -544,10 +544,11 @@ static void *hvmemul_map_linear_addr(
>>       struct hvm_emulate_ctxt *hvmemul_ctxt)
>>   {
>>       struct vcpu *curr = current;
>> -    void *err, *mapping;
>> +    void *err = NULL, *mapping;
> 
> As also said during v8 review, I don't think this (and the related)
> changes is needed anymore now that you've moved your new goto into
> the loop.

I thought it was simpler to initialize err to NULL, but you are right,
there is no need for this in this patch. I will revert those changes.

> 
>> @@ -215,6 +217,79 @@ void hvm_monitor_interrupt(unsigned int vector, unsigned int type,
>>       monitor_traps(current, 1, &req);
>>   }
>>   
>> +/*
>> + * Send memory access vm_events based on pfec. Returns true if the event was
>> + * sent and false for p2m_get_mem_access() error, no violation and event send
>> + * error. Assumes the caller will check arch.vm_event->send_event.
>> + *
>> + * NOTE: p2m_get_mem_access() can fail if the entry was not found in the EPT
>> + * (in which case access to it is unrestricted, so no violations can occur).
>> + * In this cases it is fine to continue the emulation.
>> + */
>> +bool hvm_monitor_check_ept(unsigned long gla, gfn_t gfn, uint32_t pfec,
>> +                           uint16_t kind)
> 
> Why did you choose to have "ept" in the name and also mention it
> in the commit? Is there anything in here which isn't generic p2m?

The name was suggested by Razvan Cojocaru. I have no preference for the
name. Maybe Tamas can suggest a better one.
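
Whatever the name ends up being, the return contract described in the
comment above can be modelled as follows. This is only a standalone
sketch, not the code from the patch: enum lookup_result, the boolean
inputs and check_and_send() are hypothetical stand-ins for the result of
p2m_get_mem_access(), the violation check and monitor_traps().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the outcome of p2m_get_mem_access(). */
enum lookup_result { LOOKUP_ERROR, LOOKUP_OK };

/*
 * Hypothetical model of the documented contract: return true only when a
 * vm_event was actually sent. Lookup error, no violation and a failed
 * send all return false, so the caller continues the emulation.
 */
bool check_and_send(enum lookup_result lookup, bool violation, bool send_ok)
{
    if ( lookup == LOOKUP_ERROR )
        return false; /* entry not found: treated as unrestricted access */

    if ( !violation )
        return false; /* access is allowed, nothing to report */

    return send_ok;   /* true only if the event went out */
}

/* Small self-check covering the four outcomes named in the comment. */
void check_and_send_self_check(void)
{
    assert(!check_and_send(LOOKUP_ERROR, true, true));
    assert(!check_and_send(LOOKUP_OK, false, true));
    assert(!check_and_send(LOOKUP_OK, true, false));
    assert(check_and_send(LOOKUP_OK, true, true));
}
```

Nothing in this model is EPT specific, which is the point of the naming
question.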

> 
>> --- a/xen/arch/x86/mm/mem_access.c
>> +++ b/xen/arch/x86/mm/mem_access.c
>> @@ -212,8 +212,9 @@ bool p2m_mem_access_check(paddr_t gpa, unsigned long gla,
>>       }
>>       if ( vm_event_check_ring(d->vm_event_monitor) &&
>>            d->arch.monitor.inguest_pagefault_disabled &&
>> -         npfec.kind != npfec_kind_with_gla ) /* don't send a mem_event */
>> +         npfec.kind == npfec_kind_in_gpt ) /* don't send a mem_event */
>>       {
>> +        v->arch.vm_event->send_event = true;
> 
> Since I'm being puzzled every time I see this: The comment and
> the line you add look to be in curious disagreement. Do you
> perhaps want to extend it to include something like "right
> away", or make it e.g. "try to avoid sending a mem event"?
> Personally I think it wouldn't hurt to even mention the "why"
> here.

I agree, I will update that comment.
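
To spell out the "why" for that comment, here is a standalone sketch of
how the condition changes. It is a simplified model, not the Xen code:
the enum below only mirrors the npfec kind names used in the hunk, and
defer_event_old() / defer_event_new() are hypothetical helpers.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-in for the npfec kind values; not the real Xen header. */
enum npfec_kind {
    npfec_kind_unknown,
    npfec_kind_in_gpt,   /* fault hit while walking the guest page tables */
    npfec_kind_with_gla  /* fault hit on the final, translated access */
};

/* Old check: every fault that is not "with gla" skips the immediate event. */
bool defer_event_old(bool ring_ok, bool pf_disabled, enum npfec_kind kind)
{
    return ring_ok && pf_disabled && kind != npfec_kind_with_gla;
}

/* New check: only page-walk faults skip the immediate event. */
bool defer_event_new(bool ring_ok, bool pf_disabled, enum npfec_kind kind)
{
    return ring_ok && pf_disabled && kind == npfec_kind_in_gpt;
}

/* The two checks differ only for npfec_kind_unknown. */
void defer_event_self_check(void)
{
    assert(defer_event_old(true, true, npfec_kind_unknown));
    assert(!defer_event_new(true, true, npfec_kind_unknown));
    assert(defer_event_old(true, true, npfec_kind_in_gpt));
    assert(defer_event_new(true, true, npfec_kind_in_gpt));
}
```

In the deferred case the event is not dropped: send_event is set so that
the emulation path can still send it if the instruction at RIP itself
hits a protected page.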

Thanks,
Alex