Re: [Xen-devel] [PATCH RESEND v1 1/7] x86: add a flag to enable Intel processor trace

2018-04-27 Thread Kang, Luwei
> > diff --git a/docs/misc/xen-command-line.markdown
> > b/docs/misc/xen-command-line.markdown
> > index 781110d..95411cf 100644
> > --- a/docs/misc/xen-command-line.markdown
> > +++ b/docs/misc/xen-command-line.markdown
> > @@ -1009,6 +1009,13 @@ debug hypervisor only).
> >  ### idle\_latency\_factor
> >  > `= `
> >
> > +### intel\_pt
> > +> `= `
> > +
> > +> Default: `true`
> > +
> 
> After reading the manual a bit I think this option needs to be more
> sophisticated.
> 
> The series only implements guest-only tracing, while in the future we might
> want host-only tracing and system wide tracing.
> 
> Even the other modes aren't implemented yet we should leave room for
> them.
> 

Hi Wei,
 Thanks for the review. So what about define guest mode like this and make 
the option as a string. Other mode can be added like 'system | host' in future.

### Intel\_pt
> `= guest`



By the way, you mentioned in another said that,  "No document for this option 
here?". Do you mean the description is too simple in this patch or I need add 
some words in a another document?

Thanks,
Luwei Kang

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH RESEND v1 5/7] x86: Implement Intel Processor Trace context switch

2018-04-27 Thread Kang, Luwei
> > Load/Store Intel processor trace register in context switch.
> > MSR IA32_RTIT_CTL is loaded/stored automatically from VMCS.
> > When Intel PT is supported in guest, we need load/restore PT MSRs only
> > when PT is enabled in guest.
> >
> > Signed-off-by: Luwei Kang 
> 
> Is there a reason to not use xsaves/xrstors when they are available?
> 
There have two method to implement context switch(manual and xsave/xrstors).
The first method is more directly and also won't have any performance overhead 
if intel PT is disabled.
If use xsave/xrstors we need to check if it available and whether PT is 
supported  in XSS (CPUID.0D(ecx=1).ecx).
I will think about this scenario and may make an independent patch to enable 
it. 
But I think manual save/restore will also needed in case xsave/rstores not 
available, although I think this may can't be happened.

Thanks,
Luwei Kang

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH RESEND v1 1/7] x86: add a flag to enable Intel processor trace

2018-04-27 Thread Kang, Luwei
> > --- a/docs/misc/xen-command-line.markdown
> > +++ b/docs/misc/xen-command-line.markdown
> > @@ -1009,6 +1009,13 @@ debug hypervisor only).
> >  ### idle\_latency\_factor
> >  > `= `
> >
> > +### intel\_pt
> > +> `= `
> > +
> > +> Default: `true`
> > +
> > +Flag to enable Intel Processor Trace.
> 
> I agree with what Wei has said. In addition please use dashes in preference
> to underscores for both command line options and file names (neither of the
> two is constrained by C identifier naming restrictions). That said, I'm afraid
> "pt" is an acronym we commonly associate with pass-through, so for all of file
> names, command line option, and identifiers I'd like to ask for an alternative
> to be found.
> "ptrace" may be an option, as - other than e.g. Linux - we don't associate any
> meaning to it.
> 
Hi Jan,
   Thanks for you review. "ptrace" make me associate "strace", "ftrace". 
Although they are complete  different things but I think "ptrace" is not good 
enough to present "Intel Processor Trace".

Thanks,
Luwei Kang

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH RESEND v1 1/7] x86: add a flag to enable Intel processor trace

2018-04-27 Thread Kang, Luwei
> >>> On 27.04.18 at 10:22,  wrote:
> >> > diff --git a/docs/misc/xen-command-line.markdown
> >> > b/docs/misc/xen-command-line.markdown
> >> > index 781110d..95411cf 100644
> >> > --- a/docs/misc/xen-command-line.markdown
> >> > +++ b/docs/misc/xen-command-line.markdown
> >> > @@ -1009,6 +1009,13 @@ debug hypervisor only).
> >> >  ### idle\_latency\_factor
> >> >  > `= `
> >> >
> >> > +### intel\_pt
> >> > +> `= `
> >> > +
> >> > +> Default: `true`
> >> > +
> >>
> >> After reading the manual a bit I think this option needs to be more
> >> sophisticated.
> >>
> >> The series only implements guest-only tracing, while in the future we
> >> might want host-only tracing and system wide tracing.
> >>
> >> Even the other modes aren't implemented yet we should leave room for
> >> them.
> >>
> >
> > Hi Wei,
> >  Thanks for the review. So what about define guest mode like this
> > and make the option as a string. Other mode can be added like 'system
> > | host' in future.
> >
> > ### Intel\_pt
> >> `= guest`
> 
> That's still too simple, as you implement it for HVM only. While it's not 
> clear
> to me whether it could be done properly for PV, that path shouldn't be
> closed unless it's crystal clear that it's impossible to ever happen.

The great change in Intel Processor Trace improvements is CPU will treat PT 
output addresses as Guest Physical Addresses (GPAs) and translate them using 
EPT. So if EPT is not supported or disabled in PV guest, intel PT feature can't 
be exposed to guest.
System wide mode may have a little different, we can disable intel address 
translated by EPT and enable PT in dom0  to record the behavior of hypervisor 
and  guest. Although Intel PT is still enabled(IA32_RTIT_CTL.EN=1) in guest but 
actually guest don't aware this. This mode is not implemented in this patch set 
and may need more discussion in next step.

> 
> Jan
> 


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH RESEND v1 1/7] x86: add a flag to enable Intel processor trace

2018-04-27 Thread Kang, Luwei
> >>> On 27.04.18 at 11:01,  wrote:
> >Thanks for you review. "ptrace" make me associate "strace", "ftrace".
> > Although they are complete  different things but I think "ptrace" is
> > not good enough to present "Intel Processor Trace".
> 
> Then how about ipt instead of just pt?

"ipt" is looks good to me.

Thanks,
Luwei Kang

> 
> Jan
> 


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH RESEND v1 2/7] x86: configure vmcs for Intel processor trace virtualization

2018-04-27 Thread Kang, Luwei
> > This patch configure VMCS to make Intel PT output address can be treat
> > as guest physical address and translated by EPT when intel_pt option
> > is true.
> > There have some constraint condition on VMCS configuration, otherwise
> > will cause VM entry failed.
> >
> > 1. If the “Guest PT uses Guest Physical Addresses” execution
> >control is 1, the “Clear IA32_RTIT_CTL on exit” exit
> >control and the “Load IA32_RTIT_CTL on entry” entry
> >control must also be 1.
> > 2. If the “Guest PT uses Guest Physical Addresses” execution
> >control is 1, the "enable EPT" execution control must
> >also be 1.
> 
> What are the implications for a guest running with hap=0?
> 

Intel PT enabling in guest depend on EPT feature. So if hap=0 Intel PT will 
disabled in guest.

> > @@ -383,13 +388,28 @@ static int vmx_init_vmcs_config(void)
> >  _vmx_secondary_exec_control &=
> > ~SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS;
> >
> >  min = 0;
> > -opt = VM_ENTRY_LOAD_GUEST_PAT | VM_ENTRY_LOAD_BNDCFGS;
> > +opt = VM_ENTRY_LOAD_GUEST_PAT | VM_ENTRY_LOAD_BNDCFGS |
> > +  VM_ENTRY_CONCEAL_PT_PIP | VM_ENTRY_LOAD_IA32_RTIT_CTL;
> >  _vmx_vmentry_control = adjust_vmx_controls(
> >  "VMEntry Control", min, opt, MSR_IA32_VMX_ENTRY_CTLS,
> > &mismatch);
> >
> >  if ( mismatch )
> >  return -EINVAL;
> >
> > +if ( !(_vmx_secondary_exec_control & SECONDARY_EXEC_ENABLE_EPT)
> ||
> > + !(_vmx_secondary_exec_control & SECONDARY_EXEC_PT_USE_GPA)
> ||
> > + !(_vmx_vmexit_control & VM_EXIT_CLEAR_IA32_RTIT_CTL) ||
> > + !(_vmx_vmentry_control & VM_ENTRY_LOAD_IA32_RTIT_CTL) )
> > +{
> > +_vmx_secondary_exec_control &=
> ~(SECONDARY_EXEC_PT_USE_GPA |
> > + SECONDARY_EXEC_CONCEAL_PT_PIP);
> > +_vmx_vmexit_control &= ~(VM_EXIT_CONCEAL_PT_PIP |
> > + VM_EXIT_CLEAR_IA32_RTIT_CTL);
> > +_vmx_vmentry_control &= ~(VM_ENTRY_CONCEAL_PT_PIP |
> > +  VM_ENTRY_LOAD_IA32_RTIT_CTL);
> > +opt_intel_pt = 0;
> > +}
> 
> Besides clearing the flag here, shouldn't you also check it further up?

If " opt_intel_pt =0" represent user don't want to use this feature to all 
guest or hardware don't support it at all. If flag "opt_intel_pt " still true 
after this check represent the user want to use this feature and hardware have 
capability to support PT in guest.  This is depend on hardware capability and 
the parameter set of xen command line "ipt=1".

If " opt_intel_pt = 1" but a new guest created by "hap=0" in xl.cfg. Intel PT 
will be make off in this guest and set flag " pt->intel_pt_enabled = false". 
This flag is VM specific and can't affect create another new guest have Intel 
PT with "hap=1". I am not sure if this is what you concern.

Thanks,
Luwei Kang

> 
> Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH RESEND v1 4/7] x86: add intel processor trace context

2018-04-27 Thread Kang, Luwei
> > --- a/xen/include/asm-x86/hvm/vmx/vmcs.h
> > +++ b/xen/include/asm-x86/hvm/vmx/vmcs.h
> > @@ -20,6 +20,7 @@
> >
> >  #include 
> >  #include 
> > +#include 
> >
> >  extern void vmcs_dump_vcpu(struct vcpu *v);  extern void
> > setup_vmcs_dump(void); @@ -171,6 +172,8 @@ struct arch_vmx_struct {
> >   * pCPU and wakeup the related vCPU.
> >   */
> >  struct pi_blocking_vcpu pi_blocking;
> > +
> > +struct pt_desc   pt_desc;
> 
> Perhaps better a pointer, to limit struct vcpu growth. The structure also
> doesn't actually need allocating unless the feature is present and the
> command line option allows its use. Once a pointer, you also
> - don't need to include the header above.
> - can size the allocation based on actual number of address MSRs
>   supported, instead of ...

Agree, will fix in next version.

> 
> > --- a/xen/include/asm-x86/intel_pt.h
> > +++ b/xen/include/asm-x86/intel_pt.h
> > @@ -21,6 +21,23 @@
> >  #ifndef __ASM_X86_HVM_INTEL_PT_H_
> >  #define __ASM_X86_HVM_INTEL_PT_H_
> >
> > +#include 
> > +
> > +struct pt_ctx {
> > +u64 ctl;
> > +u64 status;
> > +u64 output_base;
> > +u64 output_mask;
> > +u64 cr3_match;
> > +u64 addr[NUM_MSR_IA32_RTIT_ADDR];
> 
> ... this fixed value.
> 
> > +};
> > +
> > +struct pt_desc {
> > +bool intel_pt_enabled;
> > +unsigned int addr_num;
> > +struct pt_ctx guest_pt_ctx;
> > +};
> 
> Any particular reason to make this two structures instead of one?

If add the support of host guest mode(trace hypervisor and guest independent), 
new item can be added in the structure like " struct pt_ctx host_pt_ctx".

> 
> > --- a/xen/include/asm-x86/msr-index.h
> > +++ b/xen/include/asm-x86/msr-index.h
> > @@ -529,4 +529,24 @@
> >  #define MSR_PKGC9_IRTL 0x0634
> >  #define MSR_PKGC10_IRTL0x0635
> >
> > +/* Intel PT MSRs */
> > +#define MSR_IA32_RTIT_CTL  0x0570
> > +#define _MSR_IA32_RTIT_CTL_TRACEEN 0
> 
> Please can you avoid further extending the set of name space violations?
> Names starting with an underscore and an upper case letter are reserved.
> Unless you really need the bit position values, defining just the mask values
> ought to suffice.

Get it. Will fix it.

> 
> > +#define MSR_IA32_RTIT_CTL_TRACEEN  (1ULL <<
> _MSR_IA32_RTIT_CTL_TRACEEN)
> > +#define _MSR_IA32_RTIT_CTL_TOPA8
> > +#define MSR_IA32_RTIT_CTL_TOPA (1ULL <<
> _MSR_IA32_RTIT_CTL_TOPA)
> > +#define MSR_IA32_RTIT_STATUS   0x0571
> > +#define MSR_IA32_RTIT_CR3_MATCH0x0572
> > +#define MSR_IA32_RTIT_OUTPUT_BASE  0x0560
> > +#define MSR_IA32_RTIT_OUTPUT_MASK  0x0561
> > +#define MSR_IA32_RTIT_ADDR0_A  0x0580
> > +#define MSR_IA32_RTIT_ADDR0_B  0x0581
> > +#define MSR_IA32_RTIT_ADDR1_A  0x0582
> > +#define MSR_IA32_RTIT_ADDR1_B  0x0583
> > +#define MSR_IA32_RTIT_ADDR2_A  0x0584
> > +#define MSR_IA32_RTIT_ADDR2_B  0x0585
> > +#define MSR_IA32_RTIT_ADDR3_A  0x0586
> > +#define MSR_IA32_RTIT_ADDR3_B  0x0587
> 
> I'd prefer a single pair of
> 
> +#define MSR_IA32_RTIT_ADDR_A(n)  (0x0580 + (n) * 2)
> +#define MSR_IA32_RTIT_ADDR_B(n)  (0x0581 + (n) * 2)
> 
> > +#define NUM_MSR_IA32_RTIT_ADDR 8
> 
> The 3-bit value in CPUID output can enumerate up to 7 pairs. I'm not
> convinced hard coding a lower limit is desirable (unless I haven't been able 
> to
> spot where in the SDM such a lower limit is mandated).

Agree, looks good to me.

Thanks,
Luwei Kang

> 
> Jan
> 


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH RESEND v1 5/7] x86: Implement Intel Processor Trace context switch

2018-04-27 Thread Kang, Luwei
> > --- a/xen/arch/x86/cpu/intel_pt.c
> > +++ b/xen/arch/x86/cpu/intel_pt.c
> > @@ -21,7 +21,76 @@
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> > +#include 
> >
> >  /* intel_pt: Flag to enable Intel Processor Trace (default on). */
> > bool_t __read_mostly opt_intel_pt = 1;  boolean_param("intel_pt",
> > opt_intel_pt);
> > +
> > +static inline void pt_load_msr(struct pt_ctx *ctx, u32 addr_num)
> 
> const
> 
> > +{
> > +u32 i;
> > +wrmsrl(MSR_IA32_RTIT_STATUS, ctx->status);
> > +wrmsrl(MSR_IA32_RTIT_OUTPUT_BASE, ctx->output_base);
> > +wrmsrl(MSR_IA32_RTIT_OUTPUT_MASK, ctx->output_mask);
> > +wrmsrl(MSR_IA32_RTIT_CR3_MATCH, ctx->cr3_match);
> > +for ( i = 0; i < addr_num; i++ )
> > +wrmsrl(MSR_IA32_RTIT_ADDR0_A + i, ctx->addr[i]); }
> > +
> > +static inline void pt_save_msr(struct pt_ctx *ctx, u32 addr_num) {
> > +u32 i;
> > +rdmsrl(MSR_IA32_RTIT_STATUS, ctx->status);
> > +rdmsrl(MSR_IA32_RTIT_OUTPUT_BASE, ctx->output_base);
> > +rdmsrl(MSR_IA32_RTIT_OUTPUT_MASK, ctx->output_mask);
> > +rdmsrl(MSR_IA32_RTIT_CR3_MATCH, ctx->cr3_match);
> > +for ( i = 0; i < addr_num; i++ )
> > +rdmsrl(MSR_IA32_RTIT_ADDR0_A + i, ctx->addr[i]); }
> > +
> > +void pt_guest_enter(struct vcpu *v)
> 
> const
> 
> > +{
> > +struct pt_desc *pt = &v->arch.hvm_vmx.pt_desc;
> > +
> > +if ( pt->intel_pt_enabled &&
> > +   (pt->guest_pt_ctx.ctl & MSR_IA32_RTIT_CTL_TRACEEN) )
> > +pt_load_msr(&pt->guest_pt_ctx, pt->addr_num); }
> > +
> > +void pt_guest_exit(struct vcpu *v)
> > +{
> > +struct pt_desc *pt = &v->arch.hvm_vmx.pt_desc;
> > +
> > +if ( pt->intel_pt_enabled &&
> > +   (pt->guest_pt_ctx.ctl & MSR_IA32_RTIT_CTL_TRACEEN) )
> > +pt_save_msr(&pt->guest_pt_ctx, pt->addr_num); }
> > +
> > +void pt_vcpu_init(struct vcpu *v)
> > +{
> > +struct pt_desc *pt = &v->arch.hvm_vmx.pt_desc;
> > +unsigned int eax, ebx, ecx, edx;
> > +
> > +memset(pt, 0, sizeof(struct pt_desc));
> 
> As long as this is a sub-structure of struct vcpu, this is unnecessary.
> And once you switch to separate allocation, you should simply use xzalloc().

Get it. Will fix it.

> 
> > +pt->intel_pt_enabled = false;
> 
> This is redundant with the memset() just done.
> 
> > +if ( !cpu_has_intel_pt || !opt_intel_pt ||
> > + !(v->arch.hvm_vmx.secondary_exec_control &
> SECONDARY_EXEC_PT_USE_GPA) )
> > +return;
> > +
> > +/* get the number of address ranges */
> > +if ( cpuid_eax(0x14) == 1 )
> 
> Why would a max leaf above 1 not be okay?

Yes, you are right. Will change to:
if ( cpuid_eax(0x14) == 0 )
return;

> 
> > +cpuid_count(0x14, 1, &eax, &ebx, &ecx, &edx);
> > +else
> > +return;
> 
> Please invert the condition in the if() and use "return" first, rendering the
> "else" unnecessary.
> 
> > +pt->addr_num = eax & 0x7;
> > +pt->guest_pt_ctx.output_mask = 0x7F;
> 
> Bad literal numbers. In the latter case you want a #define, while in the
> former case you should be able to read the host CPUID policy (once you've
> properly defined fields for it).

Get it. Will fix it.

> 
> > +pt->intel_pt_enabled = true;
> > +
> > +vmx_vmcs_enter(v);
> > +__vmwrite(GUEST_IA32_RTIT_CTL, 0);
> > +vmx_vmcs_exit(v);
> 
> Wouldn't this better go into construct_vmcs(), especially if this isn't the
> default (in which case the early returns above would leave the field with a
> random value)?

Will move this into construct_vmcs(). I didn't found any case cause this with a 
random value. 
I just think about another case, If we create two guest and all can enabled 
intel PT. we may need to re-initialize the value of GUEST_IA32_RTIT_CTL if a 
physical logic CPU switch vcpu from one guest to another vcpu from a different 
guest.

> 
> > @@ -4281,6 +4284,7 @@ bool vmx_vmenter_helper(const struct
> cpu_user_regs *regs)
> >  }
> >  }
> >
> > +pt_guest_enter(curr);
> >   out:
> >  if ( unlikely(curr->arch.hvm_vmx.lbr_fixup_enabled) )
> >  lbr_fixup();
> 
> Doesn't your addition belong after the out label? Also note that the function
> has two early return paths, one of which likely would need this added, too.

Yes, will move this function into the out label.

> 
> > --- a/xen/include/asm-x86/hvm/vmx/vmcs.h
> > +++ b/xen/include/asm-x86/hvm/vmx/vmcs.h
> > @@ -421,6 +421,8 @@ enum vmcs_field {
> >  GUEST_PDPTE0= 0x280a,
> >  #define GUEST_PDPTE(n) (GUEST_PDPTE0 + (n) * 2) /* n = 0...3 */
> >  GUEST_BNDCFGS   = 0x2812,
> > +GUEST_IA32_RTIT_CTL = 0x2814,
> > +GUEST_IA32_RTIT_CTL_HIGH= 0x2815,
> 
> Did you not notice that we don't have any *_HIGH enumerators (anymore)?

Oh, get it. I will remove it in next version.

Thanks,
Luwei Kang

> 
> Jan


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/li

Re: [Xen-devel] [PATCH RESEND v1 2/7] x86: configure vmcs for Intel processor trace virtualization

2018-05-02 Thread Kang, Luwei
> >>> On 28.04.18 at 03:07,  wrote:
> >> > @@ -383,13 +388,28 @@ static int vmx_init_vmcs_config(void)
> >> >  _vmx_secondary_exec_control &=
> >> > ~SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS;
> >> >
> >> >  min = 0;
> >> > -opt = VM_ENTRY_LOAD_GUEST_PAT | VM_ENTRY_LOAD_BNDCFGS;
> >> > +opt = VM_ENTRY_LOAD_GUEST_PAT | VM_ENTRY_LOAD_BNDCFGS |
> >> > +  VM_ENTRY_CONCEAL_PT_PIP | VM_ENTRY_LOAD_IA32_RTIT_CTL;
> >> >  _vmx_vmentry_control = adjust_vmx_controls(
> >> >  "VMEntry Control", min, opt, MSR_IA32_VMX_ENTRY_CTLS,
> >> > &mismatch);
> >> >
> >> >  if ( mismatch )
> >> >  return -EINVAL;
> >> >
> >> > +if ( !(_vmx_secondary_exec_control &
> >> > + SECONDARY_EXEC_ENABLE_EPT)
> >> ||
> >> > + !(_vmx_secondary_exec_control &
> >> > + SECONDARY_EXEC_PT_USE_GPA)
> >> ||
> >> > + !(_vmx_vmexit_control & VM_EXIT_CLEAR_IA32_RTIT_CTL) ||
> >> > + !(_vmx_vmentry_control & VM_ENTRY_LOAD_IA32_RTIT_CTL) )
> >> > +{
> >> > +_vmx_secondary_exec_control &=
> >> ~(SECONDARY_EXEC_PT_USE_GPA |
> >> > + SECONDARY_EXEC_CONCEAL_PT_PIP);
> >> > +_vmx_vmexit_control &= ~(VM_EXIT_CONCEAL_PT_PIP |
> >> > + VM_EXIT_CLEAR_IA32_RTIT_CTL);
> >> > +_vmx_vmentry_control &= ~(VM_ENTRY_CONCEAL_PT_PIP |
> >> > +  VM_ENTRY_LOAD_IA32_RTIT_CTL);
> >> > +opt_intel_pt = 0;
> >> > +}
> >>
> >> Besides clearing the flag here, shouldn't you also check it further up?
> >
> > If " opt_intel_pt =0" represent user don't want to use this feature to
> > all guest or hardware don't support it at all. If flag "opt_intel_pt "
> > still true after this check represent the user want to use this
> > feature and hardware have capability to support PT in guest.  This is
> > depend on hardware capability and the parameter set of xen command line 
> > "ipt=1".
> 
> I'm having some difficulty to follow this, so let me explain my point a little
> further: If (part of) the required features is available in hardware, but the 
> user opted to not use IPT, wouldn't it be better for
> consistency to turn off the individual IPT features (so that e.g. checks of 
> them elsewhere in the code won't go wrong), i.e. pretend
> the hardware doesn't support them?

If the hardware have the capability to enable IPT in guest but the user don't 
want to use it. We can set "intel_pt = 0" in XEN command line to disable this 
feature so that IPT will can't be detected in all guest. 

Thanks,
Luwei Kang

> 
> Jan
> 


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH RESEND v1 0/7] Intel Processor Trace virtulization enabling

2018-05-02 Thread Kang, Luwei
> On Tue, Jan 16, 2018 at 02:12:26AM +0800, Luwei Kang wrote:
> > Hi All,
> >
> > Here is a patch-series which adding Processor Trace enabling in XEN guest. 
> > You can get It's software developer manuals from:
> > https://software.intel.com/sites/default/files/managed/c5/15/architect
> > ure-instruction-set-extensions-programming-reference.pdf
> > In Chapter 5 INTEL PROCESSOR TRACE: VMX IMPROVEMENTS.
> 
> Now Chapter 4.

Yes, will update this description. Thanks.

> >
> > Introduction:
> > Intel Processor Trace (Intel PT) is an extension of Intel Architecture that 
> > captures information about software execution using
> dedicated hardware facilities that cause only minimal performance 
> perturbation to the software being traced. Details on the Intel PT
> infrastructure and trace capabilities can be found in the Intel 64 and IA-32 
> Architectures Software Developer’s Manual, Volume 3C.
> >
> > The suite of architecture changes serve to simplify the process of 
> > virtualizing Intel PT for use by a guest software. There are two
> primary elements to this new architecture support for VMX support 
> improvements made for Intel PT.
> > 1. Addition of a new guest IA32_RTIT_CTL value field to the VMCS.
> >   — This serves to speed and simplify the process of disabling trace on VM 
> > exit, and restoring it on VM entry.
> > 2. Enabling use of EPT to redirect PT output.
> >   — This enables the VMM to elect to virtualize the PT output buffer using 
> > EPT. In this mode, the CPU will treat PT output
> addresses as Guest Physical Addresses (GPAs) and translate them using EPT. 
> This means that Intel PT output reads (of the ToPA
> table) and writes (of trace output) can cause EPT violations, and other 
> output events.
> >
> 
> How does one test this functionality in Linux? As in does 'perf' take 
> advantage of the Processor Trace functionality?

We can test this feature by "perf" tools in Linux. 

Thanks,
Luwei Kang
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH RESEND v1 3/7] x86: add intel proecessor trace support for cpuid

2018-05-02 Thread Kang, Luwei
> > This patch add Intel processor trace support for cpuid handling.
> 
> The 0x14 usage screams of wanting an #define.

Get it. Will define leaf 0x14 as a macro in next version. Thanks for the review.

Luwei Kang

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH RESEND v1 2/7] x86: configure vmcs for Intel processor trace virtualization

2018-05-02 Thread Kang, Luwei
> >> >>> On 28.04.18 at 03:07,  wrote:
> >> >> > @@ -383,13 +388,28 @@ static int vmx_init_vmcs_config(void)
> >> >> >  _vmx_secondary_exec_control &=
> >> >> > ~SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS;
> >> >> >
> >> >> >  min = 0;
> >> >> > -opt = VM_ENTRY_LOAD_GUEST_PAT | VM_ENTRY_LOAD_BNDCFGS;
> >> >> > +opt = VM_ENTRY_LOAD_GUEST_PAT | VM_ENTRY_LOAD_BNDCFGS |
> >> >> > +  VM_ENTRY_CONCEAL_PT_PIP |
> >> >> > + VM_ENTRY_LOAD_IA32_RTIT_CTL;
> >> >> >  _vmx_vmentry_control = adjust_vmx_controls(
> >> >> >  "VMEntry Control", min, opt, MSR_IA32_VMX_ENTRY_CTLS,
> >> >> > &mismatch);
> >> >> >
> >> >> >  if ( mismatch )
> >> >> >  return -EINVAL;
> >> >> >
> >> >> > +if ( !(_vmx_secondary_exec_control &
> >> >> > + SECONDARY_EXEC_ENABLE_EPT)
> >> >> ||
> >> >> > + !(_vmx_secondary_exec_control &
> >> >> > + SECONDARY_EXEC_PT_USE_GPA)
> >> >> ||
> >> >> > + !(_vmx_vmexit_control & VM_EXIT_CLEAR_IA32_RTIT_CTL) ||
> >> >> > + !(_vmx_vmentry_control & VM_ENTRY_LOAD_IA32_RTIT_CTL) )
> >> >> > +{
> >> >> > +_vmx_secondary_exec_control &=
> >> >> ~(SECONDARY_EXEC_PT_USE_GPA |
> >> >> > + 
> >> >> > SECONDARY_EXEC_CONCEAL_PT_PIP);
> >> >> > +_vmx_vmexit_control &= ~(VM_EXIT_CONCEAL_PT_PIP |
> >> >> > + VM_EXIT_CLEAR_IA32_RTIT_CTL);
> >> >> > +_vmx_vmentry_control &= ~(VM_ENTRY_CONCEAL_PT_PIP |
> >> >> > +  VM_ENTRY_LOAD_IA32_RTIT_CTL);
> >> >> > +opt_intel_pt = 0;
> >> >> > +}
> >> >>
> >> >> Besides clearing the flag here, shouldn't you also check it further up?
> >> >
> >> > If " opt_intel_pt =0" represent user don't want to use this feature
> >> > to all guest or hardware don't support it at all. If flag "opt_intel_pt "
> >> > still true after this check represent the user want to use this
> >> > feature and hardware have capability to support PT in guest.  This
> >> > is depend on hardware capability and the parameter set of xen command 
> >> > line "ipt=1".
> >>
> >> I'm having some difficulty to follow this, so let me explain my point
> >> a little
> >> further: If (part of) the required features is available in hardware,
> >> but
> > the user opted to not use IPT, wouldn't it be better for
> >> consistency to turn off the individual IPT features (so that e.g.
> >> checks of
> > them elsewhere in the code won't go wrong), i.e. pretend
> >> the hardware doesn't support them?
> >
> > If the hardware have the capability to enable IPT in guest but the
> > user don't want to use it. We can set "intel_pt = 0" in XEN command
> > line to disable this feature so that IPT will can't be detected in all 
> > guest.
> 
> So we're moving in circles, it seems: Based on what you wrote, you appear to 
> agree to the abstract consideration. Yet my original
> question remains
> unanswered: Besides clearing the flag here, shouldn't you also check it 
> further up?

I think we should clear what we set in VMCS (i.e. SECONDARY_EXEC_PT_USE_GPA, 
VM_EXIT_CLEAR_IA32_RTIT_CTL and VM_ENTRY_LOAD_IA32_RTIT_CTL), don't expose 
Intel PT to guest and emulate a #GP when guest access Intel PT MSRs (i.e. 
MSR_IA32_RTIT_*) if hardware support intel PT in guest but "IPT=0".

Thanks,
Luwei Kang

> 
> Jan
> 


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH RESEND v1 0/7] Intel Processor Trace virtulization enabling

2018-05-02 Thread Kang, Luwei
> > Here is a patch-series which adding Processor Trace enabling in XEN guest. 
> > You can get It's software developer manuals from:
> > https://software.intel.com/sites/default/files/managed/c5/15/architect
> > ure-instruction-set-extensions-programming-reference.pdf
> > In Chapter 5 INTEL PROCESSOR TRACE: VMX IMPROVEMENTS.
> >
> > Introduction:
> > Intel Processor Trace (Intel PT) is an extension of Intel Architecture that 
> > captures information about software execution using
> dedicated hardware facilities that cause only minimal performance 
> perturbation to the software being traced. Details on the Intel PT
> infrastructure and trace capabilities can be found in the Intel 64 and IA-32 
> Architectures Software Developer’s Manual, Volume 3C.
> >
> > The suite of architecture changes serve to simplify the process of 
> > virtualizing Intel PT for use by a guest software. There are two
> primary elements to this new architecture support for VMX support 
> improvements made for Intel PT.
> > 1. Addition of a new guest IA32_RTIT_CTL value field to the VMCS.
> >   — This serves to speed and simplify the process of disabling trace on VM 
> > exit, and restoring it on VM entry.
> > 2. Enabling use of EPT to redirect PT output.
> >   — This enables the VMM to elect to virtualize the PT output buffer using 
> > EPT. In this mode, the CPU will treat PT output
> addresses as Guest Physical Addresses (GPAs) and translate them using EPT. 
> This means that Intel PT output reads (of the ToPA
> table) and writes (of trace output) can cause EPT violations, and other 
> output events.
> >
> 
> A high level question, SDM vol 3 "Emulation of Intel PT Traced State"
> says:
> 
> "If a VMM emulates an element of processor state by taking a VM exit on reads 
> and/or writes to that piece of state, and the state
> element impacts Intel PT packet generation or values, it may be incumbent 
> upon the VMM to insert or modify the output trace
> data."
> 
> The immediately follows that paragraph is an example of CR3 causing vmexit 
> which leads to missing packet. IIRC Xen does that,
> however the code as is doesn't seem to handle that at all.

Yes, I need add some code on this. I propose if this can be handled by hardware 
but...

> 
> Another thing is Xen's vmevent allows intercepting several other traced 
> states. It seems that a more generic framework is needed to
> make PT work with vmevent subsystem? What is your thought on that?

Hi Wei,
I am not fully clear what is the "vmevent subsystem" and what is your mean 
of " several other traced states ". 
I guess vmevent is use VPMU collect performance counter? and save/load vpmu 
MSRs when it's scheduled?

Thanks,
Luwei Kang

> 
> Wei.
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH RESEND v1 6/7] x86: Implement Intel Processor Trace MSRs read/write

2018-05-02 Thread Kang, Luwei
> > +int pt_do_wrmsr(unsigned int msr, uint64_t msr_content) {
> > +struct pt_desc *pt_desc = ¤t->arch.hvm_vmx.pt_desc;
> > +
> > +if ( !opt_intel_pt )
> > +return 1;
> > +
> > +switch ( msr ) {
> > +case MSR_IA32_RTIT_CTL:
> > +pt_set_rtit_ctl(pt_desc, msr_content);
> > +break;
> > +case MSR_IA32_RTIT_STATUS:
> > +pt_desc->guest_pt_ctx.status = msr_content;
> > +break;
> > +case MSR_IA32_RTIT_OUTPUT_BASE:
> > +pt_desc->guest_pt_ctx.output_base = msr_content;
> > +break;
> > +case MSR_IA32_RTIT_OUTPUT_MASK:
> > +pt_desc->guest_pt_ctx.output_mask = msr_content | 0x7F;
> > +break;
> > +case MSR_IA32_RTIT_CR3_MATCH:
> > +pt_desc->guest_pt_ctx.cr3_match = msr_content;
> > +break;
> > +default:
> > +pt_desc->guest_pt_ctx.addr[msr - MSR_IA32_RTIT_ADDR0_A] =
> > + msr_content;
> 
> At least these last ones need to have a canonical address check attached.

Get it. Will add address range number check and "goto gp_fault" if access 
unsupported MSRs.

> 
> And there is one more thing I've not found throughout the series: EPT 
> violations and a few other VM exits have gained a new
> qualification bit, indicating that it's not the current instruction which has 
> caused the exit.

Hi Jan,
I don't quite understand here about EPT violations and other VM exit 
qualification bit. There may have an EPT violations when guest record trace to 
ToPA. Is this what is your concern? About new vm-exit qualification bit, do you 
mean there have new qualification bit for Intel PT?

> I can't imagine this to not require any change to the handling of such exits 
> - in particular, such exits must never be handled by
> invoking the insn emulator. Aiui the only handling options here are to 
> eliminate the condition causing the exit, or to crash the guest.
> There's no way to emulate the intended access.

Emulate which instructions? Can you give me an example?

Thanks,
Luwei Kang

> 
> Yet another apparently missing piece appears to be the corresponding XSAVE 
> handling.
> 
> Jan
> 


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH RESEND v1 6/7] x86: Implement Intel Processor Trace MSRs read/write

2018-05-03 Thread Kang, Luwei
> >>> On 03.05.18 at 07:22,  wrote:
> >> And there is one more thing I've not found throughout the series: EPT
> > violations and a few other VM exits have gained a new
> >> qualification bit, indicating that it's not the current instruction
> >> which
> > has caused the exit.
> >
> > I don't quite understand here about EPT violations and other VM
> > exit qualification bit. There may have an EPT violations when guest
> > record trace to ToPA. Is this what is your concern? About new vm-exit
> > qualification bit, do you mean there have new qualification bit for Intel 
> > PT?
> 
> Quoting the respective doc:
> 
> "4.2.2.1 VM Exits Due to Intel PT Output
> 
>  Treating PT output addresses as guest-physical addresses introduces the  
> possibility of taking events on PT output reads and writes.
> Event possibilities  include EPT violations, EPT misconfigurations, PML 
> log-full VM exits, and APIC  access VM exits.
> 
>  Exit Qualification
> 
>  Intel PT output reads and writes are asynchronous to instruction execution,  
> as a result of the internal buffering of trace data. Trace
> packets are output  some unpredictable number of cycles after the completion 
> of the instructions  or events that generated them.
> For this reason, any VM exit caused by Intel  PT output will set the 
> following new exit qualification bit."

Hi Jan,
 Thanks for your clarification. Please correct me if I have something 
wrong. Guest may execute an instruction and this instruction may produce an PT 
packet save in PT output buffer. An EPT violation will be generated if the 
address of this PT buffer don't have EPT page table mapping, but this EPT 
violations shouldn't be handled by x86_emulate() because it no relate with the 
execute of this instruction.

 In that case, can we build the EPT map when set the output buffer address 
(IA32_RTIT_OUTPUT_BASE) and crash the guest if still happened EPT violation 
with Intel PT output buffer read/write exit qualification. Or add an exit 
qualification check before instruction emulation?

Thanks,
Luwei Kang

> 
> >> I can't imagine this to not require any change to the handling of
> >> such exits
> > - in particular, such exits must never be handled by
> >> invoking the insn emulator. Aiui the only handling options here are
> >> to
> > eliminate the condition causing the exit, or to crash the guest.
> >> There's no way to emulate the intended access.
> >
> > Emulate which instructions? Can you give me an example?
> 
> No instructions, as I've said (and hence no example). My point is you need to 
> make sure we don't _ever_ try to emulate the
> instruction at which guest state points when this is an EPT violation (or 
> misconfiguration) caused by Intel PT.
> 
> Jan
> 


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH RESEND v1 0/7] Intel Processor Trace virtulization enabling

2018-05-03 Thread Kang, Luwei
> > Here is a patch-series which adding Processor Trace enabling in XEN guest. 
> > You can get It's software developer manuals from:
> > https://software.intel.com/sites/default/files/managed/c5/15/architect
> > ure-instruction-set-extensions-programming-reference.pdf
> > In Chapter 5 INTEL PROCESSOR TRACE: VMX IMPROVEMENTS.
> >
> > Introduction:
> > Intel Processor Trace (Intel PT) is an extension of Intel Architecture that 
> > captures information about software execution using
> dedicated hardware facilities that cause only minimal performance 
> perturbation to the software being traced. Details on the Intel PT
> infrastructure and trace capabilities can be found in the Intel 64 and IA-32 
> Architectures Software Developer’s Manual, Volume 3C.
> >
> > The suite of architecture changes serve to simplify the process of 
> > virtualizing Intel PT for use by a guest software. There are two
> primary elements to this new architecture support for VMX support 
> improvements made for Intel PT.
> > 1. Addition of a new guest IA32_RTIT_CTL value field to the VMCS.
> >   — This serves to speed and simplify the process of disabling trace on VM 
> > exit, and restoring it on VM entry.
> > 2. Enabling use of EPT to redirect PT output.
> >   — This enables the VMM to elect to virtualize the PT output buffer using 
> > EPT. In this mode, the CPU will treat PT output
> addresses as Guest Physical Addresses (GPAs) and translate them using EPT. 
> This means that Intel PT output reads (of the ToPA
> table) and writes (of trace output) can cause EPT violations, and other 
> output events.
> >
> 
> A high level question, SDM vol 3 "Emulation of Intel PT Traced State"
> says:
> 
> "If a VMM emulates an element of processor state by taking a VM exit on reads 
> and/or writes to that piece of state, and the state
> element impacts Intel PT packet generation or values, it may be incumbent 
> upon the VMM to insert or modify the output trace
> data."
> 
> The immediately follows that paragraph is an example of CR3 causing vmexit 
> which leads to missing packet. IIRC Xen does that,
> however the code as is doesn't seem to handle that at all.

Hi Wei,
Intel PT can be exposed to guest only when EPT is enabled. In that case, 
CPU_BASED_CR3_LOAD_EXITING and CPU_BASED_CR3_STORE_EXITING would be clear, so 
"MOV CR3 " will not cause a vm-exit. It looks like don't need emulate the 
missing PIP by writing it into the guest output buffer.

Thanks,
Luwei Kang

> 
> Another thing is Xen's vmevent allows intercepting several other traced 
> states. It seems that a more generic framework is needed to
> make PT work with vmevent subsystem? What is your thought on that?
> 
> Wei.
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH RESEND v1 0/7] Intel Processor Trace virtulization enabling

2018-05-03 Thread Kang, Luwei
> >>> Here is a patch-series which adding Processor Trace enabling in XEN 
> >>> guest. You can get It's software developer manuals from:
> >>> https://software.intel.com/sites/default/files/managed/c5/15/archite
> >>> ct ure-instruction-set-extensions-programming-reference.pdf
> >>> In Chapter 5 INTEL PROCESSOR TRACE: VMX IMPROVEMENTS.
> >>>
> >>> Introduction:
> >>> Intel Processor Trace (Intel PT) is an extension of Intel
> >>> Architecture that captures information about software execution
> >>> using
> >> dedicated hardware facilities that cause only minimal performance
> >> perturbation to the software being traced. Details on the Intel PT 
> >> infrastructure and trace capabilities can be found in the Intel 64
> and IA-32 Architectures Software Developer’s Manual, Volume 3C.
> >>> The suite of architecture changes serve to simplify the process of
> >>> virtualizing Intel PT for use by a guest software. There are two
> >> primary elements to this new architecture support for VMX support 
> >> improvements made for Intel PT.
> >>> 1. Addition of a new guest IA32_RTIT_CTL value field to the VMCS.
> >>>   — This serves to speed and simplify the process of disabling trace on 
> >>> VM exit, and restoring it on VM entry.
> >>> 2. Enabling use of EPT to redirect PT output.
> >>>   — This enables the VMM to elect to virtualize the PT output buffer
> >>> using EPT. In this mode, the CPU will treat PT output
> >> addresses as Guest Physical Addresses (GPAs) and translate them using
> >> EPT. This means that Intel PT output reads (of the ToPA
> >> table) and writes (of trace output) can cause EPT violations, and other 
> >> output events.
> >> A high level question, SDM vol 3 "Emulation of Intel PT Traced State"
> >> says:
> >>
> >> "If a VMM emulates an element of processor state by taking a VM exit
> >> on reads and/or writes to that piece of state, and the state element
> >> impacts Intel PT packet generation or values, it may be incumbent upon the 
> >> VMM to insert or modify the output trace data."
> >>
> >> The immediately follows that paragraph is an example of CR3 causing
> >> vmexit which leads to missing packet. IIRC Xen does that, however the code 
> >> as is doesn't seem to handle that at all.
> > Hi Wei,
> > Intel PT can be exposed to guest only when EPT is enabled. In that 
> > case, CPU_BASED_CR3_LOAD_EXITING and
> CPU_BASED_CR3_STORE_EXITING would be clear, so "MOV CR3 " will not cause a 
> vm-exit. It looks like don't need emulate the
> missing PIP by writing it into the guest output buffer.
> 
> With introspection, the guest mov to cr3 instruction might be on a page 
> protected with NX at the EPT level, at which point it traps
> for inspection and will be completed with emulation, to avoid the overhead of 
> changing EPT permissions, singlestepping the guest,
> then reinstating the NX protection.
> 
> Basically, any and all actions could end up requiring emulation, based on the 
> safety decisions of the introspection logic.

Thanks for your clarification.  I will emulate the missing PIP packet in 
vmx_vmexit_handler() -> "case EXIT_REASON_CR_ACCESS"

Thanks,
Luwei Kang
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH RESEND v1 6/7] x86: Implement Intel Processor Trace MSRs read/write

2018-05-03 Thread Kang, Luwei
> >  Thanks for your clarification. Please correct me if I have
> > something wrong. Guest may execute an instruction and this instruction
> > may produce an PT packet save in PT output buffer. An EPT violation
> > will be generated if the address of this PT buffer don't have EPT page
> > table mapping, but this EPT violations shouldn't be handled by
> > x86_emulate() because it no relate with the execute of this instruction.
> 
> Plus - and that's very important - the EPT violation may be reported on some 
> later insn.

Hi Jan,
You mean the "later instruction" is some new instruction in future hardware?  
Is there have something need we attention or any different with current 
instruction which can lead EPT violation? Sorry, I didn't catch your concern. 

> 
> >  In that case, can we build the EPT map when set the output buffer
> > address (IA32_RTIT_OUTPUT_BASE) and crash the guest if still happened
> > EPT violation with Intel PT output buffer read/write exit
> > qualification. Or add an exit qualification check before instruction 
> > emulation?
> 
> Imo you should add an exit qualification check in any case. Depending what 
> else you do, finding the new bit set may imply crashing
> the domain or doing something more sophisticated.

Do you mean add this check at the beginning of any specific "exit_reason" 
handler in vmx_vmexit_handler() function?
Another question is where to build this EPT mapping? Setting 
IA32_RTIT_OUTPUT_BASE or handled by EPT violation?

Thanks,
Luwei Kang

> 
> Jan
> 


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH RESEND v1 0/7] Intel Processor Trace virtulization enabling

2018-05-03 Thread Kang, Luwei
> > > > Here is a patch-series which adding Processor Trace enabling in XEN 
> > > > guest. You can get It's software developer manuals from:
> > > > https://software.intel.com/sites/default/files/managed/c5/15/archi
> > > > tect ure-instruction-set-extensions-programming-reference.pdf
> > > > In Chapter 5 INTEL PROCESSOR TRACE: VMX IMPROVEMENTS.
> > > >
> > > > Introduction:
> > > > Intel Processor Trace (Intel PT) is an extension of Intel
> > > > Architecture that captures information about software execution
> > > > using
> > > dedicated hardware facilities that cause only minimal performance
> > > perturbation to the software being traced. Details on the Intel PT 
> > > infrastructure and trace capabilities can be found in the Intel
> 64 and IA-32 Architectures Software Developer’s Manual, Volume 3C.
> > > >
> > > > The suite of architecture changes serve to simplify the process of
> > > > virtualizing Intel PT for use by a guest software. There are two
> > > primary elements to this new architecture support for VMX support 
> > > improvements made for Intel PT.
> > > > 1. Addition of a new guest IA32_RTIT_CTL value field to the VMCS.
> > > >   — This serves to speed and simplify the process of disabling trace on 
> > > > VM exit, and restoring it on VM entry.
> > > > 2. Enabling use of EPT to redirect PT output.
> > > >   — This enables the VMM to elect to virtualize the PT output
> > > > buffer using EPT. In this mode, the CPU will treat PT output
> > > addresses as Guest Physical Addresses (GPAs) and translate them
> > > using EPT. This means that Intel PT output reads (of the ToPA
> > > table) and writes (of trace output) can cause EPT violations, and other 
> > > output events.
> > > >
> > >
> > > A high level question, SDM vol 3 "Emulation of Intel PT Traced State"
> > > says:
> > >
> > > "If a VMM emulates an element of processor state by taking a VM exit
> > > on reads and/or writes to that piece of state, and the state element
> > > impacts Intel PT packet generation or values, it may be incumbent upon 
> > > the VMM to insert or modify the output trace data."
> > >
> > > The immediately follows that paragraph is an example of CR3 causing
> > > vmexit which leads to missing packet. IIRC Xen does that, however the 
> > > code as is doesn't seem to handle that at all.
> >
> > Yes, I need add some code on this. I propose if this can be handled by 
> > hardware but...
> >
> > >
> > > Another thing is Xen's vmevent allows intercepting several other
> > > traced states. It seems that a more generic framework is needed to make 
> > > PT work with vmevent subsystem? What is your
> thought on that?
> >
> > Hi Wei,
> > I am not fully clear what is the "vmevent subsystem" and what is your 
> > mean of " several other traced states ".
> > I guess vmevent is use VPMU collect performance counter? and save/load 
> > vpmu MSRs when it's scheduled?
> 
> See Razvan's reply.
> 
> I suppose your first step would be to make Xen able to insert new records to 
> guest's trace buffer. The end result would be a set of
> functions to do that. We would need that even without consideration of 
> vmevent because Xen can choose to intercept any of the
> traced state as it evolves.
> 
> Then we can see about how to hook that up to vmevent subsystem -- at this 
> point I think it will become a specialised case of what
> Xen already does.
> 

I briefly go through the source code of vm_event and find some function like 
hvm_monitor_msr(), hvm_monitor_cpuid(), hvm_monitor_cr(). I guess cpuid and msr 
event can work with current source code and don't need do anything.
The only things I can think of is as your mentioned " insert new records to 
guest's trace buffer ". But vmm may can't catch this event because it write by 
hardware. Or we trap the buffer write operation by making the trace buffer 
write protect or misconfiguration? 

Thanks,
Luwei Kang

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH RESEND v1 6/7] x86: Implement Intel Processor Trace MSRs read/write

2018-05-10 Thread Kang, Luwei
> >>> On 04.05.18 at 05:53,  wrote:
> >> >  Thanks for your clarification. Please correct me if I have
> >> > something wrong. Guest may execute an instruction and this
> >> > instruction may produce an PT packet save in PT output buffer. An
> >> > EPT violation will be generated if the address of this PT buffer
> >> > don't have EPT page table mapping, but this EPT violations
> >> > shouldn't be handled by
> >> > x86_emulate() because it no relate with the execute of this instruction.
> >>
> >> Plus - and that's very important - the EPT violation may be reported
> >> on some
> > later insn.
> >
> > You mean the "later instruction" is some new instruction in future hardware?
> 
> No, I mean an instruction following later in the instruction stream.
> 
> >> >  In that case, can we build the EPT map when set the output
> >> > buffer address (IA32_RTIT_OUTPUT_BASE) and crash the guest if still
> >> > happened EPT violation with Intel PT output buffer read/write exit
> >> > qualification. Or add an exit qualification check before instruction 
> >> > emulation?
> >>
> >> Imo you should add an exit qualification check in any case. Depending
> >> what
> > else you do, finding the new bit set may imply crashing
> >> the domain or doing something more sophisticated.
> >
> > Do you mean add this check at the beginning of any specific "exit_reason"
> > handler in vmx_vmexit_handler() function?
> 
> That depends. Surely not for every exit reason, but only those for which this 
> new bit is valid (iirc exit qualifications differ per exit
> reason anyway, so you can't unilaterally check the same bit everywhere). And 
> even for those where the bit is valid, I'm not sure you
> can decide in the exit handler alone what to do if the bit is set. It may be 
> necessary to propagate the flag down the call tree.
> 
> > Another question is where to build this EPT mapping? Setting
> > IA32_RTIT_OUTPUT_BASE or handled by EPT violation?
> 
> I have no idea - that's more a question for you to answer yourself.
> 

I make a draft patch for VM exit caused by Intel PT output (comments as 
annotation). It looks have little thing to deal with.
There have four case which may cause VM-exit by PT output (spec 5.2.2.1).
1. EPT violations. This because there don't have EPT mapping from GPA to HPA, I 
think this can be handled as usual. About MMIO, I think maybe we can make guest 
OS crash.
2. EPT misconfigurations. We may misconfigure EPT table for log the dirty page 
and MMIO access (Please correct me if have something wrong). I can't find there 
need to have some special need to be handled.
3. PML log-full. PT output may cause vm-exit for PML  log-full when trace 
record to a new page. But it looks don't need additional handle as well.
4. APIC access. Currently, I have no idea about how this relate with PT buffer 
write. When PT buffer is full, an PMI interrupt would be injected to this VM, 
but still have no direct relationship.

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index c23983c..fbf272e 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -1873,6 +1873,18 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long 
gla,
  (npfec.write_access &&
   (p2m_is_discard_write(p2mt) || (p2mt == p2m_ioreq_server))) )
 {
+/*
+ * Don't need emulate and make guest crash when write to
+ * mmio address or a ram address without write permission.
+ * In fact, output buffer can set to be MMIO address (35.2.6.1),
+ * it need the support of a hardware PCI card which use for
+ * collect trace information. I am afraid it initialized to
+ * a valid general MMIO address which is used by a pass through
+ * device.
+ */
+if ( npfec.pt_output )
+goto out_put_gfn;
+
 if ( !handle_mmio_with_translation(gla, gpa >> PAGE_SHIFT, npfec) )
 hvm_inject_hw_exception(TRAP_gp_fault, 0);
 rc = 1;
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 021cf33..ba4f979 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -3249,6 +3249,7 @@ static void ept_handle_violation(ept_qual_t q, paddr_t 
gpa)
 .write_access = q.write,
 .insn_fetch = q.fetch,
 .present = q.eff_read || q.eff_write || q.eff_exec,
+.pt_output = q.pt_output,
 };

 if ( tb_init_done )
diff --git a/xen/include/asm-x86/hvm/vmx/vmx.h 
b/xen/include/asm-x86/hvm/vmx/vmx.h
index 89619e4..70b6c5f 100644
--- a/xen/include/asm-x86/hvm/vmx/vmx.h
+++ b/xen/include/asm-x86/hvm/vmx/vmx.h
@@ -620,11 +620,13 @@ void vmx_pi_hooks_deassign(struct domain *d);
 typedef union ept_qual {
 unsigned long raw;
 struct {
-bool read:1, write:1, fetch:1,
+unsigned long read:1, write:1, fetch:1,
 eff_read:1, eff_write:1, eff_exec:1, /* eff_user_exec */:1,
 gla_valid:1,
-gla_fault:1; /* Valid iff gla_valid. */
-   

Re: [Xen-devel] [PATCH RESEND v1 0/7] Intel Processor Trace virtulization enabling

2018-05-10 Thread Kang, Luwei
> >>> Here is a patch-series which adding Processor Trace enabling in XEN 
> >>> guest. You can get It's software developer manuals from:
> >>> https://software.intel.com/sites/default/files/managed/c5/15/archite
> >>> ct ure-instruction-set-extensions-programming-reference.pdf
> >>> In Chapter 5 INTEL PROCESSOR TRACE: VMX IMPROVEMENTS.
> >>>
> >>> Introduction:
> >>> Intel Processor Trace (Intel PT) is an extension of Intel
> >>> Architecture that captures information about software execution
> >>> using
> >> dedicated hardware facilities that cause only minimal performance
> >> perturbation to the software being traced. Details on the Intel PT 
> >> infrastructure and trace capabilities can be found in the Intel 64
> and IA-32 Architectures Software Developer’s Manual, Volume 3C.
> >>> The suite of architecture changes serve to simplify the process of
> >>> virtualizing Intel PT for use by a guest software. There are two
> >> primary elements to this new architecture support for VMX support 
> >> improvements made for Intel PT.
> >>> 1. Addition of a new guest IA32_RTIT_CTL value field to the VMCS.
> >>>   — This serves to speed and simplify the process of disabling trace on 
> >>> VM exit, and restoring it on VM entry.
> >>> 2. Enabling use of EPT to redirect PT output.
> >>>   — This enables the VMM to elect to virtualize the PT output buffer
> >>> using EPT. In this mode, the CPU will treat PT output
> >> addresses as Guest Physical Addresses (GPAs) and translate them using
> >> EPT. This means that Intel PT output reads (of the ToPA
> >> table) and writes (of trace output) can cause EPT violations, and other 
> >> output events.
> >> A high level question, SDM vol 3 "Emulation of Intel PT Traced State"
> >> says:
> >>
> >> "If a VMM emulates an element of processor state by taking a VM exit
> >> on reads and/or writes to that piece of state, and the state element
> >> impacts Intel PT packet generation or values, it may be incumbent upon the 
> >> VMM to insert or modify the output trace data."
> >>
> >> The immediately follows that paragraph is an example of CR3 causing
> >> vmexit which leads to missing packet. IIRC Xen does that, however the code 
> >> as is doesn't seem to handle that at all.
> > Hi Wei,
> > Intel PT can be exposed to guest only when EPT is enabled. In that 
> > case, CPU_BASED_CR3_LOAD_EXITING and
> CPU_BASED_CR3_STORE_EXITING would be clear, so "MOV CR3 " will not cause a 
> vm-exit. It looks like don't need emulate the
> missing PIP by writing it into the guest output buffer.
> 
> With introspection, the guest mov to cr3 instruction might be on a page 
> protected with NX at the EPT level, at which point it traps
> for inspection and will be completed with emulation, to avoid the overhead of 
> changing EPT permissions, singlestepping the guest,
> then reinstating the NX protection.
> 
> Basically, any and all actions could end up requiring emulation, based on the 
> safety decisions of the introspection logic.

Hi Andrew, 
 As you mentioned in previous mail and emphasized in community call. Any 
instruction might be on a page protected with NX at the EPT level. So it looks 
like that almost all the Trace packet need to be emulated. For example, 
TNT(taken/not-taken) might be emulate for branch instruction, TIP(target IP) 
might be emulate for branch, interrupt, exception and so on. Is that right?

Thanks,
Luwei Kang

> 
> ~Andrew
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH RESEND v1 0/7] Intel Processor Trace virtulization enabling

2018-05-14 Thread Kang, Luwei
> > Here is a patch-series which adding Processor Trace enabling in XEN 
> > guest. You can get It's software developer manuals from:
> > https://software.intel.com/sites/default/files/managed/c5/15/archi
> > te ct ure-instruction-set-extensions-programming-reference.pdf
> > In Chapter 5 INTEL PROCESSOR TRACE: VMX IMPROVEMENTS.
> >
> > Introduction:
> > Intel Processor Trace (Intel PT) is an extension of Intel
> > Architecture that captures information about software execution
> > using
>  dedicated hardware facilities that cause only minimal performance
>  perturbation to the software being traced. Details on the Intel PT
>  infrastructure and trace capabilities can be found in the Intel 64
> >> and IA-32 Architectures Software Developer’s Manual, Volume 3C.
> > The suite of architecture changes serve to simplify the process of
> > virtualizing Intel PT for use by a guest software. There are two
>  primary elements to this new architecture support for VMX support 
>  improvements made for Intel PT.
> > 1. Addition of a new guest IA32_RTIT_CTL value field to the VMCS.
> >   — This serves to speed and simplify the process of disabling trace on 
> > VM exit, and restoring it on VM entry.
> > 2. Enabling use of EPT to redirect PT output.
> >   — This enables the VMM to elect to virtualize the PT output
> > buffer using EPT. In this mode, the CPU will treat PT output
>  addresses as Guest Physical Addresses (GPAs) and translate them
>  using EPT. This means that Intel PT output reads (of the ToPA
>  table) and writes (of trace output) can cause EPT violations, and other 
>  output events.
>  A high level question, SDM vol 3 "Emulation of Intel PT Traced State"
>  says:
> 
>  "If a VMM emulates an element of processor state by taking a VM
>  exit on reads and/or writes to that piece of state, and the state
>  element impacts Intel PT packet generation or values, it may be 
>  incumbent upon the VMM to insert or modify the output
> trace data."
> 
>  The immediately follows that paragraph is an example of CR3 causing
>  vmexit which leads to missing packet. IIRC Xen does that, however the 
>  code as is doesn't seem to handle that at all.
> >>> Hi Wei,
> >>> Intel PT can be exposed to guest only when EPT is enabled. In
> >>> that case, CPU_BASED_CR3_LOAD_EXITING and
> >> CPU_BASED_CR3_STORE_EXITING would be clear, so "MOV CR3 " will not
> >> cause a vm-exit. It looks like don't need emulate the missing PIP by 
> >> writing it into the guest output buffer.
> >>
> >> With introspection, the guest mov to cr3 instruction might be on a
> >> page protected with NX at the EPT level, at which point it traps for
> >> inspection and will be completed with emulation, to avoid the overhead of 
> >> changing EPT permissions, singlestepping the guest,
> then reinstating the NX protection.
> >>
> >> Basically, any and all actions could end up requiring emulation, based on 
> >> the safety decisions of the introspection logic.
> > Hi Andrew,
> >  As you mentioned in previous mail and emphasized in community call. 
> > Any instruction might be on a page protected with NX
> at the EPT level. So it looks like that almost all the Trace packet need to 
> be emulated. For example, TNT(taken/not-taken) might be
> emulate for branch instruction, TIP(target IP) might be emulate for branch, 
> interrupt, exception and so on. Is that right?
> 
> Yes.  Then again, this information is readily available from the emulator.  
> What we probably need (although I've not put much
> thought into this) is to accumulate a list of trace events during emulation, 
> then insert them into the trace log only when we retire
> the instruction.

Hi,
I think this may take some time to add a function list to emulate (build) 
all type of packets. Furthermore, we need to consider all the condition of one 
packet generate scenario and get the information which needed by these packets. 
It looks like we need to emulate all the behaviors of what hardware to do.
Can we move this task (about introspection) to next stage? Or mask off 
Intel PT when use introspection?

Thanks,
Luwei Kang

> 
> ~Andrew
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH RESEND v1 1/7] x86: add a flag to enable Intel processor trace

2018-03-12 Thread Kang, Luwei
> > diff --git a/docs/misc/xen-command-line.markdown
> > b/docs/misc/xen-command-line.markdown
> > index 781110d..95411cf 100644
> > --- a/docs/misc/xen-command-line.markdown
> > +++ b/docs/misc/xen-command-line.markdown
> > @@ -1009,6 +1009,13 @@ debug hypervisor only).
> >  ### idle\_latency\_factor
> >  > `= `
> >
> > +### intel\_pt
> > +> `= `
> > +
> > +> Default: `true`
> > +
> > +Flag to enable Intel Processor Trace.
> > +
> >  ### ioapic\_ack
> >  > `= old | new`
> >
> 
> No document for this option?

Thanks for the code review. Will add a simple description here.

> 
> > diff --git a/xen/arch/x86/cpu/Makefile b/xen/arch/x86/cpu/Makefile
> > index 74f23ae..33d7a74 100644
> > --- a/xen/arch/x86/cpu/Makefile
> > +++ b/xen/arch/x86/cpu/Makefile
> > @@ -8,3 +8,4 @@ obj-y += intel.o
> >  obj-y += intel_cacheinfo.o
> >  obj-y += mwait-idle.o
> >  obj-y += vpmu.o vpmu_amd.o vpmu_intel.o
> > +obj-y += intel_pt.o
> 
> Move this after intel_cacheinfo please.
> 
> We order things alphabetically.

Will check and reorder all the things in next version.

> > +
> > +/* intel_pt: Flag to enable Intel Processor Trace (default on). */
> > +bool_t __read_mostly opt_intel_pt = 1;
> 
> Use plain bool and true please.

Will fix it. BTW, I am thinking of make it "false" as default to reduce 
overhead.

Thanks,
Luwei Kang

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 06/10] x86: Introduce a new function to get capability of Intel PT

2018-07-03 Thread Kang, Luwei
> > --- a/xen/arch/x86/cpu/ipt.c
> > +++ b/xen/arch/x86/cpu/ipt.c
> > @@ -25,11 +25,74 @@
> >  #include 
> >  #include 
> >
> > +#define EAX 0
> > +#define ECX 1
> > +#define EDX 2
> > +#define EBX 3
> > +#define CPUID_REGS_NUM   4 /* number of regsters (eax, ebx, ecx, edx) */
> > +
> > +#define BIT(nr) (1UL << (nr))
> 
> I don't particularly like any such pretty generic things to be added to 
> individual files, but I also have nothing better to suggest. But
> please add the missing i to the comment.
> 
> > +#define IPT_CAP(_n, _l, _r, _m)   \
> > +[IPT_CAP_ ## _n] = { .name = __stringify(_n), .leaf = _l, \
> > +.reg = _r, .mask = _m }
> > +
> > +static struct ipt_cap_desc {
> > +const char*name;
> > +unsigned int  leaf;
> > +unsigned char reg;
> 
> I don't think leaf needs to be full 32 bits wide? Once shrunk by at least two 
> bits, the size of the overall structure could go down
> from 24 to 16 bytes.

OK, will change it from " unsigned int  " to "unsinged char".

> 
> > +unsigned int  mask;
> > +} ipt_caps[] = {
> > +IPT_CAP(max_subleaf,0, EAX, 0x),
> > +IPT_CAP(cr3_filter, 0, EBX, BIT(0)),
> > +IPT_CAP(psb_cyc,0, EBX, BIT(1)),
> > +IPT_CAP(ip_filter,  0, EBX, BIT(2)),
> > +IPT_CAP(mtc,0, EBX, BIT(3)),
> > +IPT_CAP(ptwrite,0, EBX, BIT(4)),
> > +IPT_CAP(power_event,0, EBX, BIT(5)),
> > +IPT_CAP(topa_output,0, ECX, BIT(0)),
> > +IPT_CAP(topa_multi_entry,   0, ECX, BIT(1)),
> > +IPT_CAP(single_range_output,0, ECX, BIT(2)),
> > +IPT_CAP(output_subsys,  0, ECX, BIT(3)),
> > +IPT_CAP(payloads_lip,   0, ECX, BIT(31)),
> > +IPT_CAP(addr_range, 1, EAX, 0x7),
> > +IPT_CAP(mtc_period, 1, EAX, 0x),
> > +IPT_CAP(cycle_threshold,1, EBX, 0x),
> > +IPT_CAP(psb_freq,   1, EBX, 0x),
> > +};
> 
> const?

Oh, will add it.

> 
> > +static unsigned int ipt_cap(const struct cpuid_leaf *cpuid_ipt, enum
> > +ipt_cap cap) {
> > +const struct ipt_cap_desc *cd = &ipt_caps[cap];
> > +unsigned int shift = ffs(cd->mask) - 1;
> 
> Do you really need this?
> 
> > +unsigned int val = 0;
> > +
> > +cpuid_ipt += cd->leaf;
> > +
> > +switch ( cd->reg )
> > +{
> > +case EAX:
> > +val = cpuid_ipt->a;
> > +break;
> > +case EBX:
> > +val = cpuid_ipt->b;
> > +break;
> > +case ECX:
> > +val = cpuid_ipt->c;
> > +break;
> > +case EDX:
> > +val = cpuid_ipt->d;
> > +break;
> > +}
> > +
> > +return (val & cd->mask) >> shift;
> 
> If all masks are indeed contiguous series of set bits, MASK_EXTR() can be 
> used here afaict.

Yes, it is a good define to me. Will fix it.

> 
> > +}
> > +
> >  static int __init parse_ipt_params(const char *str)  {
> >  if ( !strcmp("guest", str) )
> 
> So this is the end of the changes to this file, and the function you 
> introduce is static. I'm pretty sure compilers will warn about the
> unused static, and hence the build will fail at this point of the series (due 
> to -Werror). I think you want to introduce the function
> together with its first user.

I can't reproduce this issue by:
#./configure
# make build-xen (-Werror has been included during build)
Could you tell me how to?

Thanks,
Luwei Kang


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 03/10] x86: Add Intel Processor Trace support for cpuid

2018-07-03 Thread Kang, Luwei
> > --- a/xen/include/public/arch-x86/cpufeatureset.h
> > +++ b/xen/include/public/arch-x86/cpufeatureset.h
> > @@ -215,6 +215,7 @@ XEN_CPUFEATURE(SMAP,  5*32+20) /*S  Supervisor 
> > Mode Access Prevention */
> >  XEN_CPUFEATURE(AVX512IFMA,5*32+21) /*A  AVX-512 Integer Fused Multiply 
> > Add */
> >  XEN_CPUFEATURE(CLFLUSHOPT,5*32+23) /*A  CLFLUSHOPT instruction */
> >  XEN_CPUFEATURE(CLWB,  5*32+24) /*A  CLWB instruction */
> > +XEN_CPUFEATURE(IPT,   5*32+25) /*H  Intel Processor Trace */
> 
> Btw - introducing the feature flag here is certainly fine, but I think you 
> should add the H annotation only once functionality is
> complete. That would then e.g. also reduce the impact of patch 8 adding 
> functionality otherwise strictly necessary already in patch
> 7.

Sorry, not full understand here. Do you mean I need expose this feature to 
guest after all is ready? That is move this patch after patch 7,8 ?

Thanks,
Luwei Kang

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 07/10] x86: Add Intel Processor Trace MSRs read/write emulation

2018-07-03 Thread Kang, Luwei
> > --- a/xen/arch/x86/hvm/vmx/vmx.c
> > +++ b/xen/arch/x86/hvm/vmx/vmx.c
> > @@ -2898,6 +2898,15 @@ static int vmx_msr_read_intercept(unsigned int
> > msr, uint64_t *msr_content)
> >  if ( vpmu_do_rdmsr(msr, msr_content) )
> >  goto gp_fault;
> >  break;
> > +case MSR_IA32_RTIT_CTL:
> > +case MSR_IA32_RTIT_STATUS:
> > +case MSR_IA32_RTIT_OUTPUT_BASE:
> > +case MSR_IA32_RTIT_OUTPUT_MASK:
> > +case MSR_IA32_RTIT_CR3_MATCH:
> > +case MSR_IA32_RTIT_ADDR_A(0) ... MSR_IA32_RTIT_ADDR_B(3):
> 
> Is the 3 here an architectural limit? Otherwise you want to use a higher 
> number here and rely on the callee to do the full checking.
> 

"3" is the max number of address ranges which I can find in spec. The number of 
address ranges is get from CPUID info. Will fix it.

Thanks,
Luwei Kang

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 05/10] x86: Implement Intel Processor Trace context switch

2018-07-03 Thread Kang, Luwei
> > @@ -40,3 +42,102 @@ static int __init parse_ipt_params(const char
> > *str)
> >
> >  return 0;
> >  }
> > +
> > +static inline void ipt_load_msr(const struct ipt_ctx *ctx,
> > +   unsigned int addr_range)
> 
> Please let the compiler decide whether to inline such functions. The keyword 
> should only be used (with very few exceptions) in
> header files.

OK, get it.

> 
> > +{
> > +unsigned int i;
> > +
> > +wrmsrl(MSR_IA32_RTIT_STATUS, ctx->status);
> > +wrmsrl(MSR_IA32_RTIT_OUTPUT_BASE, ctx->output_base);
> > +wrmsrl(MSR_IA32_RTIT_OUTPUT_MASK, ctx->output_mask);
> > +wrmsrl(MSR_IA32_RTIT_CR3_MATCH, ctx->cr3_match);
> > +for ( i = 0; i < addr_range; i++ )
> 
> Wouldn't "nr" or "nr_addr" be a better parameter name?

Looks good to me.

> 
> > +{
> > +wrmsrl(MSR_IA32_RTIT_ADDR_A(i), ctx->addr[i * 2]);
> > +wrmsrl(MSR_IA32_RTIT_ADDR_B(i), ctx->addr[i * 2 + 1]);
> > +}
> > +}
> > +
> > +static inline void ipt_save_msr(struct ipt_ctx *ctx, unsigned int
> > +addr_range) {
> > +unsigned int i;
> > +
> > +rdmsrl(MSR_IA32_RTIT_STATUS, ctx->status);
> > +rdmsrl(MSR_IA32_RTIT_OUTPUT_BASE, ctx->output_base);
> > +rdmsrl(MSR_IA32_RTIT_OUTPUT_MASK, ctx->output_mask);
> > +rdmsrl(MSR_IA32_RTIT_CR3_MATCH, ctx->cr3_match);
> > +for ( i = 0; i < addr_range; i++ )
> > +{
> > +rdmsrl(MSR_IA32_RTIT_ADDR_A(i), ctx->addr[i * 2]);
> > +rdmsrl(MSR_IA32_RTIT_ADDR_B(i), ctx->addr[i * 2 + 1]);
> > +}
> > +}
> 
> So you save/restore them not at context switch, but at VM entry/exit time. 
> This means the title is misleading. But it raises efficiency
> questions:
> Is it really necessary to do it this often? In patch 7 you handle reads and 
> writes to the MSRs, but you don't disable the MSR
> intercepts (and judging from their titles no other patch is a candidate where 
> you might do that). If all writes are seen by Xen, why
> would you need to read all the MSRs here, when the majority is - afaict - not 
> modified by hardware?

when PT in disabled in guest (guest have capability to enable PT but 
RTIT_CTL.EN is 0), all the PT msrs read/write are intercepted and we don't need 
to save or restore during vm-exit/entry. When PT is enabled in guest, we need 
to save or restore the guest stat when vm-exit/entry.
What about add a flag to log the value of MSRs' changes so that we don't need 
save/restore the MSRs when guest not change these values?
> 
> > +void ipt_guest_enter(struct vcpu *v)
> > +{
> > +struct ipt_desc *ipt = v->arch.hvm_vmx.ipt_desc;
> 
> const
> 
> > +if ( !ipt )
> > +return;
> 
> Would seem better to put the check outside the call, so no call would be made 
> at all in the common case.

OK, looks good to me. Will fix it.

> 
> > +/*
> > + * Need re-initialize the guest state of IA32_RTIT_CTL
> > + * When this vcpu be scheduled to another Physical CPU.
> > + * TBD: Performance optimization. Add a new item in
> > + * struct ipt_desc to record the last pcpu, and check
> > + * if this vcpu is scheduled to another pcpu here (like vpmu).
> > + */
> > +vmx_vmcs_enter(v);
> > +__vmwrite(GUEST_IA32_RTIT_CTL, ipt->ipt_guest.ctl);
> > +vmx_vmcs_exit(v);
> 
> With the sole caller being vmx_vmenter_helper() there's no need to 
> vmx_vmcs_{enter,exit}() here afaict.

Get it.

> 
> > +int ipt_initialize(struct vcpu *v)
> > +{
> > +struct ipt_desc *ipt = NULL;
> 
> Pointless initializer.
> 
> > +unsigned int eax, tmp, addr_range;
> > +
> > +if ( !cpu_has_ipt || (ipt_mode == IPT_MODE_OFF) ||
> > + !(v->arch.hvm_vmx.secondary_exec_control &
> > + SECONDARY_EXEC_PT_USE_GPA) )
> 
> ipt_mode == IPT_MODE_OFF implies
> !(v->arch.hvm_vmx.secondary_exec_control & SECONDARY_EXEC_PT_USE_GPA) as per 
> patch 2, so no need for the separate check
> here (an ASSERT() inside the if() body would be fine). The same should 
> perhaps, if not already the case, be made true
> for !cpu_has_ipt.

Will fix it.

> 
> > +return 0;
> > +
> > +if ( cpuid_eax(IPT_CPUID) == 0 )
> > +return -EINVAL;
> > +
> > +cpuid_count(IPT_CPUID, 1, &eax, &tmp, &tmp, &tmp);
> > +addr_range = eax & IPT_ADDR_RANGE_MASK;
> 
> As per my remark further up - the use of "addr_range" should perhaps be 
> revisited throughout the patch/series.
> 
> > +ipt = _xzalloc(sizeof(struct ipt_desc) + sizeof(uint64_t) * addr_range 
> > * 2,
> > +   __alignof(*ipt));
> 
> Please don't effectively open-code xzalloc_bytes(). Also please use the type 
> of variables or expressions in scope instead of type
> names. And please get indentation right (won't be visible below anymore). IOW
> 
> ipt = xzalloc_bytes(sizeof(*ipt) + sizeof(ipt->addr_range[0]) * 
> addr_range * 2);

Will fix it.

> 
> > +void ipt_destroy(struct vcpu *v)
> > +{
> > +if ( v->arch.hvm_vmx.ipt_desc )
> > +{
> > +xfree(v->arch.hvm_vmx.ipt_desc);
> > +v->arch.hvm_v

Re: [Xen-devel] [PATCH v2 04/10] x86: Add Intel Processor Trace MSRs and bit definitions

2018-07-03 Thread Kang, Luwei
> > --- a/xen/include/asm-x86/msr-index.h
> > +++ b/xen/include/asm-x86/msr-index.h
> > @@ -548,4 +548,41 @@
> >  #define MSR_PKGC9_IRTL 0x0634
> >  #define MSR_PKGC10_IRTL0x0635
> >
> > +/* Intel PT MSRs */
> > +#define MSR_IA32_RTIT_CTL  0x0570
> > +#define RTIT_CTL_TRACEEN   (1ULL << 0)
> > +#define RTIT_CTL_CYCEN (1ULL << 1)
> > +#define RTIT_CTL_OS(1ULL << 2)
> > +#define RTIT_CTL_USR   (1ULL << 3)
> > +#define RTIT_CTL_PWR_EVT_EN(1ULL << 4)
> > +#define RTIT_CTL_FUP_ON_PTW(1ULL << 5)
> > +#define RTIT_CTL_FABRIC_EN (1ULL << 6)
> > +#define RTIT_CTL_CR3_FILTER(1ULL << 7)
> > +#define RTIT_CTL_TOPA  (1ULL << 8)
> > +#define RTIT_CTL_MTC_EN(1ULL << 9)
> > +#define RTIT_CTL_TSC_EN(1ULL << 10)
> > +#define RTIT_CTL_DIS_RETC  (1ULL << 11)
> > +#define RTIT_CTL_PTW_EN(1ULL << 12)
> > +#define RTIT_CTL_BRANCH_EN (1ULL << 13)
> > +#define RTIT_CTL_MTC_FREQ_OFFSET   14
> > +#define RTIT_CTL_MTC_FREQ  (0x0fULL << RTIT_CTL_MTC_FREQ_OFFSET)
> 
> No duplicates like these please - with MASK_EXTR() / MASK_INSR() having just 
> the mask (and no offset/shift value) is sufficient.

OK, get it. I just need to define
+#define RTIT_CTL_MTC_FREQ  (0x0fULL << RTIT_CTL_MTC_FREQ_OFFSET)
If I want to get the value of this filed (bit17:14) just call "MASK_EXTR(v, 
RTIT_CTL_MTC_FREQ)" 

> 
> > +#define RTIT_CTL_ADDR(n)   (0x0fULL << RTIT_CTL_ADDR_OFFSET(n))
> > +#define MSR_IA32_RTIT_STATUS   0x0571
> 
> Please add blank lines between one MSR (and its bits) and the next one.

Will fix it.

Thanks,
Luwei Kang

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 01/10] x86: add an flag to enable Intel Processor Trace in guest

2018-07-03 Thread Kang, Luwei
> > --- a/docs/misc/xen-command-line.markdown
> > +++ b/docs/misc/xen-command-line.markdown
> > @@ -1215,6 +1215,16 @@ Rather than only mapping RAM pages for IOMMU
> > accesses for Dom0, with this  option all pages not marked as unusable
> > in the E820 table will get a mapping  established.
> >
> > +### ipt
> > +> `= guest`
> > +
> > +> Default: `off`
> > +
> > +This option is use for switch on the Intel Processor Trace feature in
> > +HVM guest when 'ipt=guest'. By default, this feature is disabled in
> > +guest. Intel Processor Trace virtualization depend on EPT, so it can
> > +only enabled in HVM guest at present.
> > +
> >  ### irq\_ratelimit (x86)
> 
> Did you not notice the (x86) here when re-basing?

So, the option should be "### ipt (x86)" ?

> 
> > --- /dev/null
> > +++ b/xen/arch/x86/cpu/ipt.c
> > @@ -0,0 +1,42 @@
> > +/*
> > + * ipt.c: Support for Intel Processor Trace Virtualization.
> > + *
> > + * Copyright (c) 2018, Intel Corporation.
> > + *
> > + * This program is free software; you can redistribute it and/or
> > +modify it
> > + * under the terms and conditions of the GNU General Public License,
> > + * version 2, as published by the Free Software Foundation.
> > + *
> > + * This program is distributed in the hope it will be useful, but
> > +WITHOUT
> > + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
> > +or
> > + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
> > +License for
> > + * more details.
> > + *
> > + * You should have received a copy of the GNU General Public License
> > +along with
> > + * this program; If not, see .
> > + *
> > + * Author: Luwei Kang   */ #include
> > + #include  #include  #include
> > + #include 
> > +
> > +/* ipt: Flag to enable Intel Processor Trace (default off). */
> > +unsigned int __read_mostly ipt_mode = IPT_MODE_OFF; static int
> > +parse_ipt_params(const char *str);
> 
> I think it was pointed out before that the forward declaration can be avoided 
> if you move ...
> 
> > +custom_param("ipt", parse_ipt_params);
> 
> ... this line ...
> 
> > +static int __init parse_ipt_params(const char *str) {
> > +if ( !strcmp("guest", str) )
> > +ipt_mode = IPT_MODE_GUEST;
> > +else if ( str )
> > +{
> > +printk("Unknown Intel Processor Trace mode specified: '%s'\n", 
> > str);
> > +return -EINVAL;
> > +}
> > +
> > +return 0;
> > +}
> 
> ... here.

Get it. Will fix it.

> 
> > --- /dev/null
> > +++ b/xen/include/asm-x86/ipt.h
> > @@ -0,0 +1,29 @@
> > +/*
> > + * ipt.h: Intel Processor Trace virtualization for HVM domain.
> > + *
> > + * Copyright (c) 2018, Intel Corporation.
> > + *
> > + * This program is free software; you can redistribute it and/or
> > +modify it
> > + * under the terms and conditions of the GNU General Public License,
> > + * version 2, as published by the Free Software Foundation.
> > + *
> > + * This program is distributed in the hope it will be useful, but
> > +WITHOUT
> > + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
> > +or
> > + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
> > +License for
> > + * more details.
> > + *
> > + * You should have received a copy of the GNU General Public License
> > +along with
> > + * this program; If not, see .
> > + *
> > + * Author: Luwei Kang   */
> > +
> > +#ifndef __ASM_X86_HVM_IPT_H_
> > +#define __ASM_X86_HVM_IPT_H_
> > +
> > +#define IPT_MODE_OFF0
> > +#define IPT_MODE_GUEST  (1<<0)
> > +
> > +extern unsigned int ipt_mode;
> 
> At this point I can't see why the variable can't be bool. With the patch 
> being placed first in the series it is also impossible (without
> peeking into later patches) to judge whether its __read_mostly attribute is 
> actually appropriate.

OK, will change it to bool. About __read_mostly attribute,  I will remove it if 
not read frequency.

Thanks,
Luwei Kang

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 08/10] x86: Introduce a function to check the value of RTIT_CTL

2018-07-03 Thread Kang, Luwei
> > Any attempt to modify IA32_RTIT_CTL while TraceEn is set will result
> > in a #GP unless the same write also clears TraceEn.
> > Writes to IA32_RTIT_CTL that do not modify any bits will not cause a
> > #GP, even if TraceEn remains set.
> > MSR write that attempts to change bits marked reserved, or utilize
> > encodings marked reserved, will cause a #GP fault.
> 
> May I ask that you also add a similar code comment, perhaps ahead of the 
> function definition?

Get it. Will do that.

> 
> > --- a/xen/arch/x86/cpu/ipt.c
> > +++ b/xen/arch/x86/cpu/ipt.c
> > @@ -114,6 +114,114 @@ static int __init parse_ipt_params(const char *str)
> >  return 0;
> >  }
> >
> > +static int rtit_ctl_check(uint64_t new, uint64_t old)
> 
> It looks as if again you mean the function to return boolean, so please have 
> it have bool return type.

Get it.

> 
> > +{
> > +const struct cpuid_policy *p = current->domain->arch.cpuid;
> > +const struct ipt_desc *ipt_desc = current->arch.hvm_vmx.ipt_desc;
> > +uint64_t rtit_ctl_mask = ~((uint64_t)0);
> 
> Too many parentheses.

Here is a pointless initializer. Will fix it.

> > + */
> > +if ( new & rtit_ctl_mask )
> > +return 1;
> > +
> > +/*
> > + * Any attempt to modify IA32_RTIT_CTL while TraceEn is set will
> > + * result in a #GP unless the same write also clears TraceEn.
> > + */
> > +if ( (ipt_desc->ipt_guest.ctl & RTIT_CTL_TRACEEN) &&
> > +((ipt_desc->ipt_guest.ctl ^ new) & ~RTIT_CTL_TRACEEN) )
> 
> Why the ^ ? You only need to check whether new has the bit clear.

I mean if change any bits (set or clear, not include RTIT_CTL_TRACEEN) with PT 
is enabled (RTIT_CTL_TRACEEN is set) will inject an #GP.

> Also please use "old" wherever possible, if you already have it passed into 
> the function. This way it'll become obvious that the
> "nothing changed" case is already handled by the very first if().

I think also can remove "old" (decrease a function parameter) and all use 
ipt_desc->ipt_guest.ctl instead.

> 
> > +return 1;
> > +
> > +/*
> > + * WRMSR to IA32_RTIT_CTL that sets TraceEn but clears this bit
> > + * and FabricEn would cause #GP, if
> > + * CPUID.(EAX=14H, ECX=0):ECX.SNGLRGNOUT[bit 2] = 0
> > + */
> > +   if ( (new & RTIT_CTL_TRACEEN) && !(new & RTIT_CTL_TOPA) &&
> > +!(new & RTIT_CTL_FABRIC_EN) &&
> > +!ipt_cap(p->ipt.raw, IPT_CAP_single_range_output) )
> > +return 1;
> > +/*
> > + * MTCFreq, CycThresh and PSBFreq encodings check, any MSR write that
> > + * utilize encodings marked reserved will casue a #GP fault.
> > + */
> > +val = ipt_cap(p->ipt.raw, IPT_CAP_mtc_period);
> > +if ( ipt_cap(p->ipt.raw, IPT_CAP_mtc) &&
> > +!test_bit((new & RTIT_CTL_MTC_FREQ) >>
> > +RTIT_CTL_MTC_FREQ_OFFSET, &val) )
> 
> Indentation.
> 
> > @@ -171,6 +279,8 @@ int ipt_do_wrmsr(unsigned int msr, uint64_t msr_content)
> >  switch ( msr )
> >  {
> >  case MSR_IA32_RTIT_CTL:
> > +if ( rtit_ctl_check(msr_content, ipt_desc->ipt_guest.ctl) )
> > +return 1;
> >  ipt_desc->ipt_guest.ctl = msr_content;
> >  __vmwrite(GUEST_IA32_RTIT_CTL, msr_content);
> >  break;
> 
> Without this I don't see how the previous patch is complete.

What about merge this patch with patch 7 ?

Thanks,
Luwei Kang

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 09/10] x86: Disable Intel Processor Trace when VMXON in L1 guest

2018-07-03 Thread Kang, Luwei
> > @@ -1519,6 +1520,14 @@ int nvmx_handle_vmxon(struct cpu_user_regs *regs)
> >  v->arch.hvm_vmx.launched = 0;
> >  vmsucceed(regs);
> >
> > +if ( v->arch.hvm_vmx.ipt_desc )
> > +{
> > +v->arch.hvm_vmx.ipt_desc->ipt_guest.ctl = 0;
> > +vmx_vmcs_enter(current);
> > +__vmwrite(GUEST_IA32_RTIT_CTL, 0);
> > +vmx_vmcs_exit(current);
> > +}
> 
> I again don't understand why enter and exit the VMCS here.
> 
Will remove this because it in vmx_vmenter_helper(). 

Thanks,
Luwei Kang

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 05/10] x86: Implement Intel Processor Trace context switch

2018-07-04 Thread Kang, Luwei
> >> > @@ -40,3 +42,102 @@ static int __init parse_ipt_params(const char
> >> > +static inline void ipt_save_msr(struct ipt_ctx *ctx, unsigned int
> >> > +addr_range) {
> >> > +unsigned int i;
> >> > +
> >> > +rdmsrl(MSR_IA32_RTIT_STATUS, ctx->status);
> >> > +rdmsrl(MSR_IA32_RTIT_OUTPUT_BASE, ctx->output_base);
> >> > +rdmsrl(MSR_IA32_RTIT_OUTPUT_MASK, ctx->output_mask);
> >> > +rdmsrl(MSR_IA32_RTIT_CR3_MATCH, ctx->cr3_match);
> >> > +for ( i = 0; i < addr_range; i++ )
> >> > +{
> >> > +rdmsrl(MSR_IA32_RTIT_ADDR_A(i), ctx->addr[i * 2]);
> >> > +rdmsrl(MSR_IA32_RTIT_ADDR_B(i), ctx->addr[i * 2 + 1]);
> >> > +}
> >> > +}
> >>
> >> So you save/restore them not at context switch, but at VM entry/exit
> >> time. This means the title is misleading. But it raises efficiency
> >> questions:
> >> Is it really necessary to do it this often? In patch 7 you handle
> >> reads and writes to the MSRs, but you don't disable the MSR
> >> intercepts (and judging from their titles no other patch is a candidate 
> >> where you might do that). If all writes are seen by Xen, why
> would you need to read all the MSRs here, when the majority is - afaict - not 
> modified by hardware?
> >
> > when PT in disabled in guest (guest have capability to enable PT but
> > RTIT_CTL.EN is 0), all the PT msrs read/write are intercepted and we
> > don't need to save or restore during vm-exit/entry. When PT is enabled
> > in guest, we need to save or restore the guest stat when vm-exit/entry.
> 
> Why for MSRs which don't get changed by hardware?
> 
> > What about add a flag to log the value of MSRs' changes so that we
> > don't need save/restore the MSRs when guest not change these values?
> 
> I'm afraid it's not clear to me what "log the value" is supposed to mean here.

I mean add a new flag to mark if the value of Intel PT MSRs is changed by 
guest. If guest don't have any change that we don't need to save/restore the 
guest PT MSRs value to real hardware when VM-exit/entry.

> 
> >> > @@ -466,11 +467,16 @@ static int vmx_vcpu_initialise(struct vcpu *v)
> >> >  if ( v->vcpu_id == 0 )
> >> >  v->arch.user_regs.rax = 1;
> >> >
> >> > +rc = ipt_initialize(v);
> >> > +if ( rc )
> >> > +dprintk(XENLOG_ERR, "%pv: Failed to init Intel Processor
> >> > + Trace.\n", v);
> >>
> >> For such a message to be helpful, please also log rc. And no full stop in 
> >> log messages please (again with very few exceptions).
> >
> > Not full understand here. What is the " no full stop in log messages " mean?
> 
> "full stop" is the final period in a sentence. I.e. you want
> 
> dprintk(XENLOG_ERR, "%pv: Failed to init Intel Processor Trace\n", v);

Change like this ?
dprintk(XENLOG_ERR, "%pv: Failed to init Intel Processor Trace: err=%d.\n", v, 
rc);

Thanks,
Luwei Kang

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 06/10] x86: Introduce a new function to get capability of Intel PT

2018-07-04 Thread Kang, Luwei
> >> > +#define IPT_CAP(_n, _l, _r, _m)   \
> >> > +[IPT_CAP_ ## _n] = { .name = __stringify(_n), .leaf = _l, \
> >> > +.reg = _r, .mask = _m }
> >> > +
> >> > +static struct ipt_cap_desc {
> >> > +const char*name;
> >> > +unsigned int  leaf;
> >> > +unsigned char reg;
> >>
> >> I don't think leaf needs to be full 32 bits wide? Once shrunk by at
> >> least two bits, the size of the overall structure could go down from 24 to 
> >> 16 bytes.
> >
> > OK, will change it from " unsigned int  " to "unsinged char".
> 
> I'd prefer if you used bit fields, as was meant to be implied by my reply.

like this? If two bits is too few for "leaf"?

static const struct ipt_cap_desc {
const char*name;
unsigned char leaf:2;
unsigned char reg:2;
unsinged int mask;
}

> 
> >> > +static unsigned int ipt_cap(const struct cpuid_leaf *cpuid_ipt,
> >> > +enum ipt_cap cap) {
> >> > +const struct ipt_cap_desc *cd = &ipt_caps[cap];
> >> > +unsigned int shift = ffs(cd->mask) - 1;
> >>
> >> Do you really need this?
> >>
> >> > +unsigned int val = 0;
> >> > +
> >> > +cpuid_ipt += cd->leaf;
> >> > +
> >> > +switch ( cd->reg )
> >> > +{
> >> > +case EAX:
> >> > +val = cpuid_ipt->a;
> >> > +break;
> >> > +case EBX:
> >> > +val = cpuid_ipt->b;
> >> > +break;
> >> > +case ECX:
> >> > +val = cpuid_ipt->c;
> >> > +break;
> >> > +case EDX:
> >> > +val = cpuid_ipt->d;
> >> > +break;
> >> > +}
> >> > +
> >> > +return (val & cd->mask) >> shift;
> >>
> >> If all masks are indeed contiguous series of set bits, MASK_EXTR()
> >> can be
> > used here afaict.
> >
> > Yes, it is a good define to me. Will fix it.
> >
> >>
> >> > +}
> >> > +
> >> >  static int __init parse_ipt_params(const char *str)  {
> >> >  if ( !strcmp("guest", str) )
> >>
> >> So this is the end of the changes to this file, and the function you
> >> introduce is static. I'm pretty sure compilers will warn about the
> >> unused static, and hence the build will fail at this point of the series 
> >> (due to -Werror). I think you want to introduce the function
> together with its first user.
> >
> > I can't reproduce this issue by:
> > #./configure
> > # make build-xen (-Werror has been included during build)
> > Could you tell me how to?
> 
> There is certainly the possibility that gcc versions differ in this regard.
> But I'm sure it's clear to you that the code should build fine with all 
> supported versions. There's also the possibility that I'm
> overlooking something.


Get it. Will test it.

Thanks,
Luwei Kang

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 05/10] x86: Implement Intel Processor Trace context switch

2018-07-04 Thread Kang, Luwei
> >> >> > @@ -40,3 +42,102 @@ static int __init parse_ipt_params(const
> >> >> > char
> >> >> > +static inline void ipt_save_msr(struct ipt_ctx *ctx, unsigned
> >> >> > +int
> >> >> > +addr_range) {
> >> >> > +unsigned int i;
> >> >> > +
> >> >> > +rdmsrl(MSR_IA32_RTIT_STATUS, ctx->status);
> >> >> > +rdmsrl(MSR_IA32_RTIT_OUTPUT_BASE, ctx->output_base);
> >> >> > +rdmsrl(MSR_IA32_RTIT_OUTPUT_MASK, ctx->output_mask);
> >> >> > +rdmsrl(MSR_IA32_RTIT_CR3_MATCH, ctx->cr3_match);
> >> >> > +for ( i = 0; i < addr_range; i++ )
> >> >> > +{
> >> >> > +rdmsrl(MSR_IA32_RTIT_ADDR_A(i), ctx->addr[i * 2]);
> >> >> > +rdmsrl(MSR_IA32_RTIT_ADDR_B(i), ctx->addr[i * 2 + 1]);
> >> >> > +}
> >> >> > +}
> >> >>
> >> >> So you save/restore them not at context switch, but at VM
> >> >> entry/exit time. This means the title is misleading. But it raises
> >> >> efficiency
> >> >> questions:
> >> >> Is it really necessary to do it this often? In patch 7 you handle
> >> >> reads and writes to the MSRs, but you don't disable the MSR
> >> >> intercepts (and judging from their titles no other patch is a
> >> >> candidate
> > where you might do that). If all writes are seen by Xen, why
> >> would you need to read all the MSRs here, when the majority is -
> >> afaict - not
> > modified by hardware?
> >> >
> >> > when PT in disabled in guest (guest have capability to enable PT
> >> > but RTIT_CTL.EN is 0), all the PT msrs read/write are intercepted
> >> > and we don't need to save or restore during vm-exit/entry. When PT
> >> > is enabled in guest, we need to save or restore the guest stat when 
> >> > vm-exit/entry.
> >>
> >> Why for MSRs which don't get changed by hardware?
> >>
> >> > What about add a flag to log the value of MSRs' changes so that we
> >> > don't need save/restore the MSRs when guest not change these values?
> >>
> >> I'm afraid it's not clear to me what "log the value" is supposed to mean 
> >> here.
> >
> > I mean add a new flag to mark if the value of Intel PT MSRs is changed
> > by guest. If guest don't have any change that we don't need to
> > save/restore the guest PT MSRs value to real hardware when VM-exit/entry.
> 
> Okay, in which case back to the original question: Without disabling the 
> intercepts, you know what the guest wrote last. Why read
> the MSR then?

Oh, understand. I re-check these MSRs in spec and Only IA32_RTIT_STATUS is an 
exception. It can be changed by hardware.
Bit[3:0] are write ignore. 
Bit[5:4] : these bit are set by hardware and once it set only software can 
clear it.
Bit[48:32]: write by processor and can clear or modify this filed at any time 
when PT is enabled.
So, I think this register (IA32_RTIT_STATUS) is needed read after vm-exit. 
Other MSRs may don't need.
Bit[5:4] should be cleared in any way before vm-entry because only software can 
clear it.


> 
> >> >> > @@ -466,11 +467,16 @@ static int vmx_vcpu_initialise(struct vcpu *v)
> >> >> >  if ( v->vcpu_id == 0 )
> >> >> >  v->arch.user_regs.rax = 1;
> >> >> >
> >> >> > +rc = ipt_initialize(v);
> >> >> > +if ( rc )
> >> >> > +dprintk(XENLOG_ERR, "%pv: Failed to init Intel
> >> >> > + Processor Trace.\n", v);
> >> >>
> >> >> For such a message to be helpful, please also log rc. And no full
> >> >> stop in
> > log messages please (again with very few exceptions).
> >> >
> >> > Not full understand here. What is the " no full stop in log messages "
> > mean?
> >>
> >> "full stop" is the final period in a sentence. I.e. you want
> >>
> >> dprintk(XENLOG_ERR, "%pv: Failed to init Intel Processor
> >> Trace\n",
> > v);
> >
> > Change like this ?
> > dprintk(XENLOG_ERR, "%pv: Failed to init Intel Processor Trace:
> > err=%d.\n", v, rc);
> 
> Excuse me - I've told you to omit the full stop, and there it is again.
> Apart from that, yes, this is one option. A slightly short one we use here 
> and there is
> 
> dprintk(XENLOG_ERR, "%pv: Failed to init Intel Processor Trace (%d)\n", v, 
> rc);

I Just understand. "Full stop" is the  "." at the end of the sentence. We don't 
need that. :)

Thanks,
Luwei Kang

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 06/10] x86: Introduce a new function to get capability of Intel PT

2018-07-04 Thread Kang, Luwei
> >> >> > +#define IPT_CAP(_n, _l, _r, _m)   \
> >> >> > +[IPT_CAP_ ## _n] = { .name = __stringify(_n), .leaf = _l, \
> >> >> > +.reg = _r, .mask = _m }
> >> >> > +
> >> >> > +static struct ipt_cap_desc {
> >> >> > +const char*name;
> >> >> > +unsigned int  leaf;
> >> >> > +unsigned char reg;
> >> >>
> >> >> I don't think leaf needs to be full 32 bits wide? Once shrunk by
> >> >> at least two bits, the size of the overall structure could go down from 
> >> >> 24 to 16 bytes.
> >> >
> >> > OK, will change it from " unsigned int  " to "unsinged char".
> >>
> >> I'd prefer if you used bit fields, as was meant to be implied by my reply.
> >
> > like this? If two bits is too few for "leaf"?
> >
> > static const struct ipt_cap_desc {
> > const char*name;
> > unsigned char leaf:2;
> > unsigned char reg:2;
> > unsinged int mask;
> > }
> 
> Yes. As suggested before I'd use more bits for leaf, to avoid this needing to 
> change immediately once a new leaf becomes known.
> After all the goal is only to have leaf and reg together fit in 32 bits. 
> Making leaf 8 bits wide for now would likely help generated code.
> And please don't use unsigned char in cases like this where you don't really 
> need the more narrow type - be as close to what the
> standard allows without extensions as possible; IOW unsigned int here.
> 

static const struct ipt_cap_desc {
const char*name;
unsigned int leaf:8;
unsigned int reg:2;
unsinged int mask;
}

Thanks,
Luwei Kang

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 03/10] x86: Add Intel Processor Trace support for cpuid

2018-07-12 Thread Kang, Luwei
> >>> On 30.05.18 at 15:27,  wrote:
> > @@ -759,12 +760,19 @@ int xc_cpuid_apply_policy(xc_interface *xch, uint32_t 
> > domid,
> >  continue;
> >  }
> >
> > +if ( input[0] == 0x14 )
> > +{
> > +input[1]++;
> > +if ( input[1] == 1 )
> > +continue;
> > +}
> 
> Together with what's there and what iirc Andrew's series puts here, this 
> should become a switch() imo.

Why use switch() here? I don't think need change to switch() and I can't find 
any example in this function use switch(). For example leaf 4 is also implement 
like this.

/* Intel cache descriptor leaves. */
if ( input[0] == 4 )
{
input[1]++;
/* More to do? Then loop keeping %%eax==0x0004. */
if ( (regs[0] & 0x1f) != 0 )
continue;
}

> 
> > @@ -583,7 +584,19 @@ void recalculate_cpuid_policy(struct domain *d)
> >  __clear_bit(X86_FEATURE_VMX, max_fs);
> >  __clear_bit(X86_FEATURE_SVM, max_fs);
> >  }
> > +
> > +/*
> > + * Hide Intel Processor trace feature when hardware not support
> > + * PT-VMX or ipt option is off.
> > + */
> > +if ( ipt_mode == IPT_MODE_OFF )
> > +{
> > +__clear_bit(X86_FEATURE_IPT, max_fs);
> > +zero_leaves(p->ipt.raw, 0, ARRAY_SIZE(p->ipt.raw) - 1);
> > +}
> 
> The clearing of bits in max_fs further up is needed here because this varies 
> depending on domain config. You, otoh, put a
> conditional here which is not going to change post boot. This instead belongs 
> into
> calculate_hvm_max_policy() I believe.

ipt_mode is any global parameter for all domain. Move to 
calculate_hvm_max_policy() is good to me.
Will fix in next version.

> 
> > @@ -101,6 +102,10 @@ static int update_domain_cpuid_info(struct domain *d,
> >  p->feat.raw[ctl->input[1]] = leaf;
> >  break;
> >
> > +case IPT_CPUID:
> > +p->ipt.raw[ctl->input[1]] = leaf;
> > +break;
> 
> This lacks a bounds check of ctl->input[1] (in the earlier switch()).

Oh, get it. 

> 
> > --- a/xen/include/asm-x86/cpufeature.h
> > +++ b/xen/include/asm-x86/cpufeature.h
> > @@ -102,6 +102,7 @@
> >  #define cpu_has_mpx boot_cpu_has(X86_FEATURE_MPX)
> >  #define cpu_has_rdseed  boot_cpu_has(X86_FEATURE_RDSEED)
> >  #define cpu_has_smapboot_cpu_has(X86_FEATURE_SMAP)
> > +#define cpu_has_ipt boot_cpu_has(X86_FEATURE_IPT)
> 
> This definition is unused.

Will remove it.

> 
> > --- a/xen/include/asm-x86/cpuid.h
> > +++ b/xen/include/asm-x86/cpuid.h
> > @@ -58,10 +58,11 @@ DECLARE_PER_CPU(struct cpuidmasks, cpuidmasks);
> >  /* Default masking MSR values, calculated at boot. */  extern struct
> > cpuidmasks cpuidmask_defaults;
> >
> > -#define CPUID_GUEST_NR_BASIC  (0xdu + 1)
> > +#define CPUID_GUEST_NR_BASIC  (0x14u + 1)
> 
> Is there anything to convince me that the intermediate leaves don't need any 
> further handling added anywhere? Same question btw
> for the libxc side bumping of DEF_MAX_BASE.

They are all zero and meaningless in these intermediate leaves. So I think we 
don't need do anything. what is your concern ?

> 
> > @@ -166,6 +167,15 @@ struct cpuid_policy
> >  } comp[CPUID_GUEST_NR_XSTATE];
> >  } xstate;
> >
> > +/* Structured feature leaf: 0x0014[xx] */
> > +union {
> > +struct cpuid_leaf raw[CPUID_GUEST_NR_IPT];
> > +struct {
> > +/* Subleaf 0. */
> > +uint32_t max_subleaf;
> > +};
> > +} ipt;
> 
> In particular this looks to be placed earlier than it should be (in other 
> words I'm getting the impression that you failed to insert
> some padding for the skipped leaves).

I think we don't need add some padding for skipped leaves because these are 
accessed by name (e.g. xstate, ipt ...) not offset.

Thanks,
Luwei Kang

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] x86 Community Call - Wed July 11, 14:00 - 15:00 UTC - Minutes

2018-07-12 Thread Kang, Luwei
Hi Lars,
I think I have sent the minutes of design session to you. I attached the email 
in case you can’t found.

Thanks,
Luwei Kang

From: Lars Kurth [mailto:lars.ku...@citrix.com]
Sent: Thursday, July 12, 2018 9:23 PM
To: Ji, John ; xen-devel 
Cc: committ...@xenproject.org; Tamas K Lengyel ; 
intel-xen ; daniel.ki...@oracle.com; Roger Pau Monne 
; Christopher Clark ; Rich 
Persaud ; Brian Woods ; Stefano 
Stabellini ; Julien Grall ; 
Juergen Gross 
Subject: Re: x86 Community Call - Wed July 11, 14:00 - 15:00 UTC - Minutes

John,
Thank you. I have notes and slides for SGX, which are already published and 
Intel Processor Trace (just the slides – already published, but I am not sure 
whether these already were updated to reflect the design discussion) and 
EPT-Based Sub-Page Protection (just the slides – already published)
Lars

From: "Ji, John" mailto:john...@intel.com>>
Date: Thursday, 12 July 2018 at 13:33
To: Lars Kurth mailto:lars.ku...@citrix.com>>, xen-devel 
mailto:xen-devel@lists.xenproject.org>>
Cc: "committ...@xenproject.org" 
mailto:committ...@xenproject.org>>, Tamas K Lengyel 
mailto:tamas.k.leng...@gmail.com>>, intel-xen 
mailto:intel-...@intel.com>>, 
"daniel.ki...@oracle.com" 
mailto:daniel.ki...@oracle.com>>, Roger Monne 
mailto:roger@citrix.com>>, Christopher Clark 
mailto:christopher.w.cl...@gmail.com>>, Rich 
Persaud mailto:pers...@gmail.com>>, Brian Woods 
mailto:brian.wo...@amd.com>>, Stefano Stabellini 
mailto:sstabell...@kernel.org>>, Julien Grall 
mailto:julien.gr...@arm.com>>, Juergen Gross 
mailto:jgr...@suse.com>>
Subject: RE: x86 Community Call - Wed July 11, 14:00 - 15:00 UTC - Minutes

I will Yu and Yi to send out discussion notes.

>>### Add vNVDIMM support to HVM domains
>>Stakeholders: Zhang Yi, Intel, Zhang Yu, Intel, George Dunlap, Citrix ``` _As 
>>far as I understand a simple and clean way to implement this has been found, 
>>but the design session notes are still missing_

>>_We spent almost two days on NVDIMM related discussions: we have something 
>>that should be fairly simple and easy to implement. Dan Williams is happy to 
>>take changes into upstream as long as they are sensible._



Best Regards

John Ji


-Original Message-
From: Lars Kurth [mailto:lars.ku...@citrix.com]
Sent: Thursday, July 12, 2018 5:07 PM
To: xen-devel 
mailto:xen-devel@lists.xenproject.org>>
Cc: committ...@xenproject.org; Tamas K 
Lengyel mailto:tamas.k.leng...@gmail.com>>; 
intel-xen mailto:intel-...@intel.com>>; 
daniel.ki...@oracle.com; Roger Pau Monne 
mailto:roger@citrix.com>>; Christopher Clark 
mailto:christopher.w.cl...@gmail.com>>; Rich 
Persaud mailto:pers...@gmail.com>>; Brian Woods 
mailto:brian.wo...@amd.com>>; Stefano Stabellini 
mailto:sstabell...@kernel.org>>; Julien Grall 
mailto:julien.gr...@arm.com>>; Juergen Gross 
mailto:jgr...@suse.com>>
Subject: Re: x86 Community Call - Wed July 11, 14:00 - 15:00 UTC - Minutes

Also attached minutes as PDF and Markdown

# Agenda and Minutes: x86 Community Call July 2018

_No new items were added to the agenda._ ​ _Minutes are added in blue (in the 
PDF only)_

## Attendees

Lars Kurth, Citrix
Roger Pau Monne, Citrix
Juergen Gross, Suse
Jan Beulich, Suse
Christopher Clark, OpenXT
Janakarajan Natarajan, AMD
Brian Woods, AMD
Rich Persaud, ​OpenXT
George Dunlap, Citrix
Wei, Andy, Paul - Citrix

## Release Cadence for Xen 4.12

Following the release cadence session at the developer summit (see 
https://lists.xenproject.org/archives/html/xen-devel/2018-07/threads.html#00166​
 & 
https://docs.google.com/document/d/1W7OuISUau-FtPG6tIinD4GXYFb-hKDjaqTj84pogNrA/
edit​) we have to make a decision whether
* Go on as we are for 4.
* Move to 9 months, until we fixed the underlying issues as outlined in the 
thread and
write-up: the problem is that unless we get some sort of commitment to address 
the issues, just changing the release cadence will not make a difference
* Skip a release as a one-off: Set ourselves some goals that must be achieved 
in this cycle around testing - this will need some commitment from vendors

I was planning to allocate up to 30 minutes to this discussion

Juergen: raises the point that keeping the release cadence at 6 months is very 
unfair on Jan who has raised many times that the workload resulting from having 
to maintain so many release branches would be too high. After running 6 monthly 
releases for some time, this has in fact come true, when at the time Jan’s 
concerns were dismissed. The overhead breaks down into backporting fixes, 
backporting security fixes and dealing with the release mechanics.

Jan: raised the point that hardly anyone responds to calls for back-ports and 
if so, only send change-sets and lat Jan do the backporting. Jan also says he 
suspects that people may not respond to backport requests, because that would 
require them to backport the patch

Re: [Xen-devel] [PATCH v2 00/10] Intel Processor Trace virtulization enabling

2018-05-30 Thread Kang, Luwei
> -Original Message-
> From: Julien Grall [mailto:julien.gr...@arm.com]
> Sent: Wednesday, May 30, 2018 11:15 PM
> To: Kang, Luwei ; xen-de...@lists.xen.org
> Cc: andrew.coop...@citrix.com; george.dun...@eu.citrix.com; 
> ian.jack...@eu.citrix.com; jbeul...@suse.com;
> konrad.w...@oracle.com; sstabell...@kernel.org; t...@xen.org; 
> wei.l...@citrix.com; Nakajima, Jun ; Tian,
> Kevin 
> Subject: Re: [PATCH v2 00/10] Intel Processor Trace virtulization enabling
> 
> Hi,
> 
> Can you please avoid CC everyone on each patch? You can use 
> scripts/get_maintainers.pl on each patch to see who is required to be
> CCed.

OK, get it. I use script/get_maintainers.pl to get the people who need to be CC 
and indeed different patch may include different peoples. If somebody  just 
receive one patch of this patch set may feel a little strange and don't 
understand the context. So I CC all the peoples who is mentioned in this patch 
set.

> 
> Cheers,
> 
> --
> Julien Grall
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 00/10] Intel Processor Trace virtulization enabling

2018-05-31 Thread Kang, Luwei
> >> -Original Message-
> >> From: Julien Grall [mailto:julien.gr...@arm.com]
> >> Sent: Wednesday, May 30, 2018 11:15 PM
> >> To: Kang, Luwei ; xen-de...@lists.xen.org
> >> Cc: andrew.coop...@citrix.com; george.dun...@eu.citrix.com;
> >> ian.jack...@eu.citrix.com; jbeul...@suse.com; konrad.w...@oracle.com;
> >> sstabell...@kernel.org; t...@xen.org; wei.l...@citrix.com; Nakajima,
> >> Jun ; Tian, Kevin 
> >> Subject: Re: [PATCH v2 00/10] Intel Processor Trace virtulization
> >> enabling
> >>
> >> Hi,
> >>
> >> Can you please avoid CC everyone on each patch? You can use
> >> scripts/get_maintainers.pl on each patch to see who is required to be CCed.
> >
> > OK, get it. I use script/get_maintainers.pl to get the people who need to 
> > be CC and indeed different patch may include different
> peoples. If somebody  just receive one patch of this patch set may feel a 
> little strange and don't understand the context. So I CC all
> the peoples who is mentioned in this patch set.
> 
> That's usually why I CC everyone on the cover letter. Then for each patch I 
> CC only the necessary person.
> 
> This avoids maintainers to have to look for what they should review/ack.

Get it. This is a good way to me.  Thanks.

Luwei Kang
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH RESEND v1 0/7] Intel Processor Trace virtulization enabling

2018-01-16 Thread Kang, Luwei
> >>> On 15.01.18 at 19:12,  wrote:
> > Luwei Kang (7):
> >   x86: add a flag to enable Intel processor trace
> >   x86: configure vmcs for Intel processor trace virtualization
> >   x86: add intel proecessor trace support for cpuid
> >   x86: add intel processor trace context
> >   x86: Implement Intel Processor Trace context switch
> >   x86: Implement Intel Processor Trace MSRs read/write
> >   x86: Disable Intel Processor Trace when VMXON in L1 guest
> 
> How can this be a re-send of v1 when the original v1 consisted of just 6 
> patches?
>

Yes, I make some change in Intel PT context switch(only save /load guest state 
when Intel PT is enabled in guest), Intel PT MSRs pass through strategy (only 
passthrough MSRs when PT is enabled in guest)  and  add a patch (patch 7) to 
disable Intel PT-VMX in nested.
Because of there just have some high level comments from community and above 
all changes is from my point of view. So I re-send this patch set still as v1.

Thanks,
Luwei Kang

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH RESEND v1 0/7] Intel Processor Trace virtulization enabling

2018-01-16 Thread Kang, Luwei
> >>> On 16.01.18 at 10:02,  wrote:
> >> >>> On 15.01.18 at 19:12,  wrote:
> >> > Luwei Kang (7):
> >> >   x86: add a flag to enable Intel processor trace
> >> >   x86: configure vmcs for Intel processor trace virtualization
> >> >   x86: add intel proecessor trace support for cpuid
> >> >   x86: add intel processor trace context
> >> >   x86: Implement Intel Processor Trace context switch
> >> >   x86: Implement Intel Processor Trace MSRs read/write
> >> >   x86: Disable Intel Processor Trace when VMXON in L1 guest
> >>
> >> How can this be a re-send of v1 when the original v1 consisted of just 6 
> >> patches?
> >>
> >
> > Yes, I make some change in Intel PT context switch(only save /load
> > guest state when Intel PT is enabled in guest), Intel PT MSRs pass
> > through strategy (only passthrough MSRs when PT is enabled in guest)
> > and  add a patch (patch
> > 7) to disable Intel PT-VMX in nested.
> > Because of there just have some high level comments from community and
> > above all changes is from my point of view. So I re-send this patch set 
> > still as v1.
> 
> Any change makes it a new version. If you don't want to use v2, use v1.5 or 
> v1.1 in such a case.
> 

Get it.

Thanks,
Luwei Kang


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

RE: [PATCH v1 0/7] Implement support for external IPT monitoring

2020-06-17 Thread Kang, Luwei
> -Original Message-
> From: Tian, Kevin 
> Sent: Wednesday, June 17, 2020 9:35 AM
> To: Michał Leszczyński ; Andrew Cooper
> 
> Cc: Xen-devel ; Jan Beulich
> ; Wei Liu ; Roger Pau Monné
> ; Nakajima, Jun ; George
> Dunlap ; Ian Jackson ;
> Julien Grall ; Stefano Stabellini ;
> Kang, Luwei 
> Subject: RE: [PATCH v1 0/7] Implement support for external IPT monitoring
> 
> +Luwei, who developed PT for KVM and is the best one who can help
> review VMX changes from Intel side. Please include him in future post or
> discussion.
> 
> > -Original Message-
> > From: Michał Leszczyński 
> > Sent: Wednesday, June 17, 2020 2:48 AM
> > To: Andrew Cooper 
> > Cc: Xen-devel ; Jan Beulich
> > ; Wei Liu ; Roger Pau Monné
> > ; Nakajima, Jun ; Tian,
> > Kevin ; George Dunlap
> > ; Ian Jackson ;
> > Julien Grall ; Stefano Stabellini
> > 
> > Subject: Re: [PATCH v1 0/7] Implement support for external IPT
> > monitoring
> >
> > - 16 cze 2020 o 20:17, Andrew Cooper andrew.coop...@citrix.com
> > napisał(a):
> >
> > > On 16/06/2020 16:16, Michał Leszczyński wrote:
> > >> Intel Processor Trace is an architectural extension available in
> > >> modern
> > Intel
> > >> family CPUs. It allows recording the detailed trace of activity
> > >> while the processor executes the code. One might use the recorded
> > >> trace to
> > reconstruct
> > >> the code flow. It means, to find out the executed code paths,
> > >> determine branches taken, and so forth.
> > >>
> > >> The abovementioned feature is described in Intel(R) 64 and IA-32
> > Architectures
> > >> Software Developer's Manual Volume 3C: System Programming Guide,
> > Part 3,
> > >> Chapter 36: "Intel Processor Trace."
> > >>
> > >> This patch series implements an interface that Dom0 could use in
> > >> order to
> > enable
> > >> IPT for particular vCPUs in DomU, allowing for external monitoring.
> > >> Such a feature has numerous applications like malware monitoring,
> > >> fuzzing, or performance testing.
> > >
> > > Hello,
> > >
> > > I'm very excited to see support like this appearing.  However, be
> > > aware that we're currently in code freeze for the 4.14 release, so
> > > in-depth reviews will probably be delayed somewhat due to our bug
> > > queue and release activities.
> >
> > Sure, take your time :)
> >
> >
> > >
> > > That said, I've had a very quick look through the series, and have a
> > > few general questions first.
> > >
> > > AFAICT, this is strictly for external monitoring of the VM, not for
> > > the VM to use itself?  If so, it shouldn't have the H tag here:
> > >
> > > XEN_CPUFEATURE(IPT,   5*32+25) /*H  Intel Processor Trace */
> > >
> > > because that exposes the feature to the guest, with the implication
> > > that all other parts of the feature work as advertised.
> >
> > Ok, I will remove the H tag.
> >
> >
> > >
> > >
> > > Are there any restrictions on EPT being enabled in the first place?
> > > I'm not aware of any, and in principle we could use this
> > > functionality for PV guests as well (using the CPL filter).
> > > Therefore, I think it would be helpful to not tie the functionality
> > > to HVM guests, even if that is the only option enabled to start with.
> >
> > I think at the moment it's not required to have EPT. This patch series
> > doesn't use any translation feature flags, so the output address is
> > always a machine physical address, regardless of context. I will check
> > if it could be easily used with PV.
> >
> >
> > >
> > > The buffer mapping and creation logic is fairly problematic.
> > > Instead of fighting with another opencoded example, take a look at
> > > the IOREQ server's use of "acquire resource" which is a mapping
> > > interface which supports allocating memory on behalf of the guest,
> > > outside of the guest memory, for use by control tools.
> > >
> > > I think what this wants is a bit somewhere in domain_create to
> > > indicate that external tracing is used for this domain (and allocate
> > > whatever structures/buffers are necessary), acquire resource to map
> > > the buffers themselves, and a domctl for any necessary runtime cont

RE: [PATCH v1 0/7] Implement support for external IPT monitoring

2020-06-17 Thread Kang, Luwei
> > > -Original Message-
> > > From: Tian, Kevin 
> > > Sent: Wednesday, June 17, 2020 9:35 AM
> > > To: Michał Leszczyński ; Andrew Cooper
> > > 
> > > Cc: Xen-devel ; Jan Beulich
> > > ; Wei Liu ; Roger Pau Monné
> > > ; Nakajima, Jun ;
> > > George Dunlap ; Ian Jackson
> > > ; Julien Grall ; Stefano
> > > Stabellini ; Kang, Luwei
> > > 
> > > Subject: RE: [PATCH v1 0/7] Implement support for external IPT
> > > monitoring
> > >
> > > +Luwei, who developed PT for KVM and is the best one who can help
> > > review VMX changes from Intel side. Please include him in future
> > > post or discussion.
> > >
> > > > -Original Message-
> > > > From: Michał Leszczyński 
> > > > Sent: Wednesday, June 17, 2020 2:48 AM
> > > > To: Andrew Cooper 
> > > > Cc: Xen-devel ; Jan Beulich
> > > > ; Wei Liu ; Roger Pau Monné
> > > > ; Nakajima, Jun ;
> > > > Tian, Kevin ; George Dunlap
> > > > ; Ian Jackson
> > > > ; Julien Grall ;
> > > > Stefano Stabellini 
> > > > Subject: Re: [PATCH v1 0/7] Implement support for external IPT
> > > > monitoring
> > > >
> > > > - 16 cze 2020 o 20:17, Andrew Cooper andrew.coop...@citrix.com
> > > > napisał(a):
> > > >
> > > > > On 16/06/2020 16:16, Michał Leszczyński wrote:
> > > > > When this subject was broached on xen-devel before, one issue
> > > > > was the fact that all actions which are intercepted don't end up
> > > > > writing any appropriate packets.  This is perhaps less of an
> > > > > issue for this example, where the external agent can see VMExits
> > > > > in the trace, but it still results in missing information.  (It
> > > > > is a major problem for PT within the guest, and needs Xen's
> > > > > intercept/emulation framework being updated to be PT-aware so it
> > > > > can fill in the same packets which hardware would have done for
> > > > > equivalent actions.)
> > > >
> > > > Ok, this sounds like a hard issue. Could you point out what could
> > > > be the particular problematic cases? For instance, if something
> > > > would alter EIP/RIP or CR3 then I belive it would still be
> > > > recorded in PT trace (i.e. these values will be logged on VM entry).
> >
> > e.g. If a VM exit is taken on a guest write to CR3 (including “MOV
> > CR3” as well as task switches), the PIP packet normally generated on the CR3
> write will be missing. The PIP packet needs to be written to the PT buffer by
> software. Another example is VM-exit taken on RDTSC.
> >
> > For VM introspection, all the Intel PT packets may need to emulated by
> software. Some description in SDM as below:
> > If a VMM emulates an element of processor state by taking a VM exit on
> reads and/or writes to that piece of state, and the state element impacts 
> Intel
> PT packet generation or values, it may be incumbent upon the VMM to insert
> or modify the output trace data.
> 
> I got the impression that IPT was mostly useful together with introspection, 
> as
> you can then get events from trapped instructions (and likely emulated) from
> the introspection interface, while being able to get the processor trace for 
> non-
> trapped events.
> 
> I'm not sure whether there would be corner cases with trapped instructions
> not being handled by the introspection framework.
> 
> How does KVM deal with this, do they insert/modify trace packets on trapped
> and emulated instructions by the VMM?

The KVM includes instruction decoder and emulator(arch/x86/kvm/emulate.c), and 
the guest's memory can be set to write-protect as well. But it doesn't support 
Intel PT packets software emulator. For KVM, the Intel PT feature will be 
exposed to KVM guest and KVM guest can use Intel PT feature like native.

Thanks,
Luwei Kang


RE: [PATCH v1 0/7] Implement support for external IPT monitoring

2020-06-17 Thread Kang, Luwei
> > > How does KVM deal with this, do they insert/modify trace packets on
> > > trapped and emulated instructions by the VMM?
> >
> > The KVM includes instruction decoder and
> emulator(arch/x86/kvm/emulate.c), and the guest's memory can be set to
> write-protect as well. But it doesn't support Intel PT packets software 
> emulator.
> For KVM, the Intel PT feature will be exposed to KVM guest and KVM guest can
> use Intel PT feature like native.
> 
> But if such feature is exposed to the guest for it's own usage, won't it be
> missing packets for instructions emulated by the VMM?

If setting the guest's memory write-protect, I think yes. 

Thanks,
Luwei Kang

> 
> Thanks, Roger.


RE: [PATCH v1 7/7] x86/vmx: switch IPT MSRs on vmentry/vmexit

2020-06-17 Thread Kang, Luwei
> > On Wed, Jun 17, 2020 at 01:54:45PM +0200, Michał Leszczyński wrote:
> >> - 17 cze 2020 o 11:09, Roger Pau Monné roger@citrix.com napisał(a):
> >>
> >>> 24 Virtual Machine Control Structures -> 24.8 VM-entry Control
> >>> Fields -> 24.8.1 VM-Entry Controls Software should consult the VMX
> capability MSRs IA32_VMX_ENTRY_CTLS to determine how it should set the
> reserved bits.
> >> Please look at bit position 18 "Load IA32_RTIT_CTL".
> > I think this is something different from what I was referring to.
> > Those options you refer to (load/clear IA32_RTIT_CTL) deal with
> > loading/storing a specific field on the vmcs that maps to the guest
> > IA32_RTIT_CTL.
> >
> > OTOH MSR load lists can be used to load and store any arbitrary MSR on
> > vmentry/vmexit, see section 26.4 LOADING MSRS on the SDM. There's
> > already infrastructure on Xen to do so, see vmx_{add/del/find}_msr.
> 
> If I remember the historic roadmaps correctly, there are 3 cases.
> 
> The first hardware to support PT (Broadwell?) prohibited its use completely in
> VMX operations.  In this case, we can use it to trace PV guests iff we don't
> enable VMX in hardware to begin with.
> 
> This was relaxed in later hardware (Skylake?) to permit use within VMX
> operations, but without any help in the VMCS.  (i.e. manual context switching
> per this patch, or MSR load lists as noted in the SDM.)
> 
> Subsequent support for "virtualised PT" was added (IceLake?) which adds the
> load/save controls, and the ability to translate the output buffer under EPT.
> 
> 
> All of this is from memory so I'm quite possibly wrong with details, but I 
> believe
> this is why the current complexity exists.

Yes, It include 3 cases.
1. Before IA32_VMX_MISC[bit 14]:
 Intel PT doesn't support tracing in VMX operation. Execution of the VMXON 
instruction clears IA32_RTIT_CTL.TraceEn and any attempt to write IA32_RTIT_CTL 
in VMX operation causes a general-protection exception (#GP)
2. Support IA32_VMX_MISC[bit 14] but no EPT to direct PT output:
Intel PT can be enabled across VMX but the address of Intel PT buffer is 
always HPA from HW point of view. There is not VMCS support in this stage. The 
MSR load list can be used for Intel PT context switch(VM-Entry/Exit).
3. Intel PT VM improvements (start from Icelake):
Add a new guest IA32_RTIT_CTL field in VMCS, and HW treat the PT output 
addresses as GPA and translate them using EPT.

Thanks,
Luwei Kang