On 22.09.2022 15:42, Marek Marczykowski-Górecki wrote:
> Jürgen: today two grant formats, v1 supports only up to 16TB addresses
>         v2 solves 16TB issue, introduces several more features^Wbugs
>         v2 is 16 bytes per entry, v1 is 8 bytes per entry, v2 more 
> complicated interface to the hypervisor
>         virtio could use per-device grant table, currently virtio iommu 
> device, slow interface
>         v3 could be a grants tree (like iommu page tables), not flat array, 
> separate trees for each grantee
>         could support sharing large pages too
>         easier to have more grants, continuous grant numbers etc
>         two options to distinguish trees (from HV PoV):
>         - sharing guest ensures distinct grant ids between (multiple) trees
>         - hv tells guest the index under which the tree got registered
>         v3 can be addition to v1/v2, old used for simpler cases where tree is 
> an overkill
>         hypervisor needs extra memory to keep refcounts - resource allocation 
> discussion
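
For reference, the 16TB limit in v1 falls straight out of the entry
layout (quoting xen/include/public/grant_table.h from memory, so please
double-check the details):

  struct grant_entry_v1 {
      uint16_t flags;   /* GTF_* */
      domid_t  domid;   /* grantee; domid_t is 16 bits wide */
      uint32_t frame;   /* frame number being granted */
  };                    /* 8 bytes */

  struct grant_entry_header {
      uint16_t flags;
      domid_t  domid;
  };

  union grant_entry_v2 {
      struct grant_entry_header hdr;
      struct {
          struct grant_entry_header hdr;
          uint32_t pad0;
          uint64_t frame;
      } full_page;
      /* sub_page and transitive variants elided */
      uint32_t __spacer[4];
  };                    /* 16 bytes */

With a 32-bit frame number and 4KiB pages v1 can name at most
2^32 * 4KiB = 16TiB; widening the frame field is what costs v2 its
16 bytes per entry.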

How would refcounts be different from today? Perhaps I don't have a clear
enough picture yet how you envision the tree-like structure(s) to be used.
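
For the sake of discussion I'm picturing something vaguely page-table
like, one tree per grantee, along these lines (entirely made up, just
to check I understand the proposal):

  #include <stdint.h>
  #include <stddef.h>

  typedef uint32_t grant_ref_t;

  #define GNT_LEVEL_SHIFT 9                        /* 512 slots per level */
  #define GNT_LEVEL_MASK  ((1u << GNT_LEVEL_SHIFT) - 1)

  struct gnt_leaf {                                /* one shared frame */
      uint64_t frame;
      uint16_t flags;
  };

  struct gnt_tree {                                /* one tree per grantee */
      struct gnt_leaf *l1[1u << GNT_LEVEL_SHIFT];  /* allocated on demand */
  };

  static struct gnt_leaf *gnt_lookup(const struct gnt_tree *t, grant_ref_t ref)
  {
      struct gnt_leaf *l1;

      if ( ref >= (1u << (2 * GNT_LEVEL_SHIFT)) )  /* two levels: 256k refs */
          return NULL;
      l1 = t->l1[ref >> GNT_LEVEL_SHIFT];
      return l1 ? &l1[ref & GNT_LEVEL_MASK] : NULL;
  }

(Whether the hypervisor-side refcounts would then hang off such leaves
or off separate per-tree allocations is part of what I'm asking above.)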

>         hv could have TLB to speedup mapping
>         issue with v1/v2 - granter cannot revoke pages from uncooperative 
> backend
>         tree could have special page for revoking grants (redirect to that 
> page)
>         special domids, local to the guest, toolstack restarting backend could 
> request to keep the same virtual domid
> Marek:  that requires stateless (or recoverable) protocol, reusing domid 
> currently causes issues
> Andrei: how revoking could work
> Jürgen: there needs to be hypercall, replacing and invalidating mapping (scan 
> page tables?), possibly adjusting IOMMU etc; may fail, problematic for PV

Why would this be problematic for PV only? In principle any
number of mappings of a grant are possible also for PVH/HVM. So
all of them would need finding and replacing. Because of the
multiple mappings, the M2P is of no use here.
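
For the sake of argument I'm assuming an operation shaped roughly like
this (entirely hypothetical -- no such GNTTABOP exists today; only the
grant_ref_t / xen_pfn_t / GNTST_* names are real):

  struct gnttab_revoke {
      /* IN */
      grant_ref_t ref;          /* grant to forcibly revoke */
      xen_pfn_t   scratch_gfn;  /* frame which stale mappings get
                                   redirected to, as per the "redirect
                                   to that page" idea above */
      /* OUT */
      int16_t     status;       /* GNTST_* */
  };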

While thinking about this I started wondering in how far things
are actually working correctly right now for backends in PVH/HVM:
Any mapping of a grant is handed to p2m_add_page(), which insists
on there being exactly one mapping of any particular MFN, unless
the page is a foreign one. But how does that allow a domain to
map its own grants, e.g. when block-attaching a device locally in
Dom0? Afaict the grant-map would succeed, but the page would be
unmapped from its original GFN.
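
(The local scenario I have in mind is simply something along the lines
of "xl block-attach 0 phy:/dev/loop0,xvda,w" issued in Dom0, making
Dom0 both frontend and backend for the same device.)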

> Yann:   can backend refuse revoking?
> Jürgen: it shouldn't be this way, but revoke could be controlled by feature 
> flag; revoke could pass scratch page per revoke call (more flexible control)

A single scratch page comes with the risk of data corruption, as all
I/O would be directed there. A sink page (for memory writes) would
likely be okay, but device writes (memory reads) can't be done from
a surrogate page.

> Marek:  what about unmap notification?
> Jürgen: revoke could even be async; ring page for unmap notifications
> 
> Marek:  downgrading mappings (rw -> ro)
> Jürgen: must be careful, to not allow crashing backend
> 
> Jürgen: we should consider interface to mapping large pages ("map this area 
> as a large page if backend shared it as large page")

s/backend/frontend/ I guess?

> Edwin:  what happens when shattering that large page?
> Jürgen: on live migration pages are rebuilt anyway, can reconstruct large 
> pages

If only we did already rebuild large pages ...

Jan
