Hi, Andy,

On Thu, Sep 23, 2021 at 04:17:05PM -0700, Andy Lutomirski wrote:
> On Mon, Sep 20, 2021, at 12:23 PM, Fenghua Yu wrote:
> > ENQCMD requires the IA32_PASID MSR has a valid PASID value which was
> > allocated to the process during bind. The MSR could be populated eagerly
> > by an IPI after the PASID is allocated in bind. But the method was
> > disabled in commit 9bfecd058339 ("x86/cpufeatures: Force disable
> > X86_FEATURE_ENQCMD and remove update_pasid()")' due to locking and other
> > issues.
> >
> > Since the MSR was cleared in fork()/clone(), the first ENQCMD will
> > generate a #GP fault. The #GP fault handler will initialize the MSR
> > if a PASID has been allocated for this process.
> >
> > The lazy enabling of the PASID MSR in the #GP handler is not an elegant
> > solution. But it has the least complexity that fits with h/w behavior.
> >
> > Signed-off-by: Fenghua Yu <[email protected]>
> > Reviewed-by: Tony Luck <[email protected]>
> > ---
> >  arch/x86/include/asm/fpu/api.h |  6 ++++
> >  arch/x86/include/asm/iommu.h   |  2 ++
> >  arch/x86/kernel/fpu/xstate.c   | 59 ++++++++++++++++++++++++++++++++++
> >  arch/x86/kernel/traps.c        | 12 +++++++
> >  drivers/iommu/intel/svm.c      | 32 ++++++++++++++++++
> >  5 files changed, 111 insertions(+)
> >
> > diff --git a/arch/x86/include/asm/fpu/api.h 
> > b/arch/x86/include/asm/fpu/api.h
> > index ca4d0dee1ecd..f146849e5c8c 100644
> > --- a/arch/x86/include/asm/fpu/api.h
> > +++ b/arch/x86/include/asm/fpu/api.h
> > @@ -106,4 +106,10 @@ extern int cpu_has_xfeatures(u64 xfeatures_mask, 
> > const char **feature_name);
> >   */
> >  #define PASID_DISABLED     0
> > 
> > +#ifdef CONFIG_INTEL_IOMMU_SVM
> > +void fpu__pasid_write(u32 pasid);
> > +#else
> > +static inline void fpu__pasid_write(u32 pasid) { }
> > +#endif
> > +
> >  #endif /* _ASM_X86_FPU_API_H */
> > diff --git a/arch/x86/include/asm/iommu.h b/arch/x86/include/asm/iommu.h
> > index bf1ed2ddc74b..9c4bf9b0702f 100644
> > --- a/arch/x86/include/asm/iommu.h
> > +++ b/arch/x86/include/asm/iommu.h
> > @@ -26,4 +26,6 @@ arch_rmrr_sanity_check(struct acpi_dmar_reserved_memory 
> > *rmrr)
> >     return -EINVAL;
> >  }
> > 
> > +bool __fixup_pasid_exception(void);
> > +
> >  #endif /* _ASM_X86_IOMMU_H */
> > diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
> > index c8def1b7f8fb..8a89b2cecd77 100644
> > --- a/arch/x86/kernel/fpu/xstate.c
> > +++ b/arch/x86/kernel/fpu/xstate.c
> > @@ -1289,3 +1289,62 @@ int proc_pid_arch_status(struct seq_file *m, 
> > struct pid_namespace *ns,
> >     return 0;
> >  }
> >  #endif /* CONFIG_PROC_PID_ARCH_STATUS */
> > +
> > +#ifdef CONFIG_INTEL_IOMMU_SVM
> > +/**
> > + * fpu__pasid_write - Write the current task's PASID state/MSR.
> > + * @pasid: the PASID
> > + *
> > + * The PASID is written to the IA32_PASID MSR directly if the MSR is 
> > active.
> > + * Otherwise it's written to the PASID. The IA32_PASID MSR should 
> > contain
> > + * the PASID after returning to the user.
> > + *
> > + * This is called only when ENQCMD is enabled.
> > + */
> > +void fpu__pasid_write(u32 pasid)
> > +{
> > +   struct xregs_state *xsave = &current->thread.fpu.state.xsave;
> > +   u64 msr_val = pasid | MSR_IA32_PASID_VALID;
> > +   struct fpu *fpu = &current->thread.fpu;
> > +
> > +   /*
> > +    * ENQCMD always uses the compacted XSAVE format. Ensure the buffer
> > +    * has space for the PASID.
> > +    */
> > +   BUG_ON(!(xsave->header.xcomp_bv & XFEATURE_MASK_PASID));
> > +
> > +   fpregs_lock();
> > +
> > +   /*
> > +    * If the task's FPU doesn't need to be loaded or is valid, directly
> > +    * write the IA32_PASID MSR. Otherwise, write the PASID state and
> > +    * the MSR will be loaded from the PASID state before returning to
> > +    * the user.
> > +    */
> > +   if (!test_thread_flag(TIF_NEED_FPU_LOAD) ||
> > +       fpregs_state_valid(fpu, smp_processor_id())) {
> > +           wrmsrl(MSR_IA32_PASID, msr_val);
> 
> Let me try to decode this.
> 
> If the current task's FPU state is live or if the state is in memory but the 
> CPU regs just happen to match the copy in memory, then write the MSR.  Else 
> write the value to memory.
> 
> This is wrong.  If !TIF_NEED_FPU_LOAD && fpregs_state_valid, you MUST NOT 
> MODIFY FPU STATE.

But the FPU state is not modified if !TIF_NEED_FPU_LOAD && fpregs_state_valid.

The FPU state is modified only if TIF_NEED_FPU_LOAD && !fpregs_state_valid.
In this case, the FPU state will be restored to the IA32_PASID MSR when
exiting to the user. In all other cases, the FPU state will be not be
restored on exiting to the user and thus the IA32_PASID MSR is directly
written here.

Is it right?

>  This is not negotiable -- you will break coherence between CPU regs and the 
> memory image.

fpregs_assert_state_consistent() warns on !TIF_NEED_FPU_LOAD &&
!fpregs_state_valid. Is that breaking coherence you are talking?

>  The way you modify the current task's state is either you modify it in CPU 
> regs (if the kernel knows that the CPU regs are the one and only source of 
> truth) OR you modify it in memory and invalidate any preserved copies (by 
> zapping last_cpu). 
> 
> In any event, that particular bit of logic really doesn't belong in here -- 
> it belongs in some helper that gets it right, once.
> 
> > +
> > +/*
> > + * Try to figure out if there is a PASID MSR value to propagate to the
> > + * thread taking the #GP.
> > + */
> > +bool __fixup_pasid_exception(void)
> > +{
> > +   u32 pasid;
> > +
> > +   /*
> > +    * This function is called only when this #GP was triggered from user
> > +    * space. So the mm cannot be NULL.
> > +    */
> > +   pasid = current->mm->pasid;
> > +
> > +   /* If no PASID is allocated, there is nothing to propagate. */
> > +   if (pasid == PASID_DISABLED)
> > +           return false;
> > +
> > +   /*
> > +    * If the current task already has a valid PASID MSR, then the #GP
> > +    * fault must be for some non-ENQCMD related reason.
> > +    */
> > +   if (current->has_valid_pasid)
> > +           return false;
> 
> IMO "has_valid_pasid" is a poor name.  An mm can have a PASID.  A task has 
> noticed or it hasn't.
> 
> If you really need an in-memory cache, call it "pasid_msr_is_loaded".  Or 
> just read the field out of FPU state, but that might be painfully slow.

Agree. Thomas wants to change "has_valid_pasid" to "holds_pasid_ref" to
represents if the task takes a reference to the PASID.

Thanks.

-Fenghua
_______________________________________________
iommu mailing list
[email protected]
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Reply via email to