On Tue, Jun 11, 2024 at 11:33:24AM +0200, Jan Beulich wrote:
> On 11.06.2024 11:02, Roger Pau Monné wrote:
> > On Tue, Jun 11, 2024 at 10:26:32AM +0200, Jan Beulich wrote:
> >> On 11.06.2024 09:41, Roger Pau Monné wrote:
> >>> On Mon, Jun 10, 2024 at 04:58:52PM +0200, Jan Beulich wrote:
> >>>> --- a/xen/arch/x86/mm/p2m-ept.c
> >>>> +++ b/xen/arch/x86/mm/p2m-ept.c
> >>>> @@ -503,7 +503,8 @@ int epte_get_entry_emt(struct domain *d,
> >>>>  
> >>>>      if ( !mfn_valid(mfn) )
> >>>>      {
> >>>> -        *ipat = true;
> >>>> +        *ipat = type != p2m_mmio_direct ||
> >>>> +                (!is_iommu_enabled(d) && !cache_flush_permitted(d));
> >>>
> >>> Looking at this, shouldn't the !mfn_valid special case be removed, and
> >>> mfns without a valid page be processed normally, so that the guest
> >>> MTRR values are taken into account, and no iPAT is enforced?
> >>
> >> Such removal is what, in the post commit message remark, I'm referring to
> >> as "moving to too lax". Doing so might be okay, but will imo be hard to
> >> prove to be correct for all possible cases. Along these lines goes also
> >> that I'm adding the IOMMU-enabled and cache-flush checks: In principle
> >> p2m_mmio_direct should not be used when neither of these return true. Yet
> >> a similar consideration would apply to the immediately subsequent if().
> >>
> >> Removing this code would, in particular, result in INVALID_MFN getting a
> >> type of WB by way of the subsequent if(), unless the type there would
> >> also be p2m_mmio_direct (which, as said, it ought to never be for non-
> >> pass-through domains). That again _may_ not be a problem as long as such
> >> EPT entries would never be marked present, yet that's again difficult to
> >> prove.
> > 
> > My understanding is that the !mfn_valid() check was a way to detect
> > MMIO regions in order to exit early and set those to UC.  I however
> > don't follow why the guest MTRR settings shouldn't also be applied to
> > those regions.
> 
> It's unclear to me whether the original purpose of he check really was
> (just) MMIO. It could as well also have been to cover the (then not yet
> named that way) case of INVALID_MFN.
> 
> As to ignoring guest MTRRs for MMIO: I think that's to be on the safe
> side. We don't want guests to map uncachable memory with a cachable
> memory type. Yet control isn't fine grained enough to prevent just
> that. Hence why we force UC, allowing merely to move to WC via PAT.

Would that be to cover up for guests bugs, or there's a coherency
reason for not allowing guests to access memory using fully guest
chosen cache attributes?

I really wonder whether Xen has enough information to figure out
whether a hole (MMIO region) is supposed to be accessed as UC or
something else.

Your proposed patch already allows guest to set such attributes in
PAT, and hence I don't see why also taking guest MTRRs into account
would be any worse.

> > I'm also confused by your comment about "as such EPT entries would
> > never be marked present": non-present EPT entries don't even get into
> > epte_get_entry_emt(), and hence we could assert in epte_get_entry_emt
> > that mfn != INVALID_MFN?
> 
> I don't think we can. Especially for the call from ept_set_entry() I
> can't spot anything that would prevent the call for non-present entries.
> This may be a mistake, but I can't do anything about it right here.

Hm, I see, then we should explicitly handle INVALID_MFN in
epte_get_entry_emt(), and just return early.

> >>> I also think this likely wants a:
> >>>
> >>> Fixes: 81fd0d3ca4b2 ('x86/hvm: simplify 'mmio_direct' check in 
> >>> epte_get_entry_emt()')
> >>
> >> Oh, indeed, I should have dug out when this broke. I didn't because I
> >> knew this mfn_valid() check was there forever, neglecting that it wasn't
> >> always (almost) first.
> >>
> >>> As AFAICT before that commit direct MMIO regions would set iPAT to WB,
> >>> which would result in the correct attributes (albeit guest MTRR was
> >>> still ignored).
> >>
> >> Two corrections here: First iPAT is a boolean; it can't be set to WB.
> >> And then what was happening prior to that change was that for the APIC
> >> access page iPAT was set to true, thus forcing WB there. iPAT was left
> >> set to false for all other p2m_mmio_direct pages, yielding (PAT-
> >> overridable) UC there.
> > 
> > Right, that behavior was still dubious to me, as I would assume those
> > regions would also want to fetch the type from guest MTRRs.
> 
> Well, for the APIC access page we want to prevent it becoming UC. It's MMIO
> from the guest's perspective, yet _we_ know it's really ordinary RAM. For
> actual MMIO see above; the only case where we probably ought to respect
> guest MTRRs is when they say WC (following from what I said further up).
> Yet that's again an independent change to (possibly) make.

For emulated devices we might map regular RAM into what the guest
otherwise thinks it's MMIO.  Maybe the mfn_valid() check should be
inverted, and return WB when the underlying mfn is RAM, and otherwise
use the guest MTRRs to decide the cache attribute?

Thanks, Roger.

Reply via email to