Hi,
I am writing down here what we discussed on another thread and on IRC.
This will be easier to track in a single thread.
On 04/08/2021 23:00, Julien Grall wrote:
On 04/08/2021 21:56, Oleksandr wrote:
Now, I am wondering, would it be possible to update/clarify the
current "reg" purpose and use it to pass a safe unallocated space for
any Xen specific mappings (grant, foreign, whatever) instead of just
for the grant table region. In case, it is not allowed for any reason
(compatibility PoV, etc), would it be possible to extend a property by
passing an extra range separately, something similar to how I
described above?
I think it should be fine to re-use the same region so long the size of
the first bank is at least the size of the original region.
While answering to the DT binding question on the DT ML, I realized that
this is probably not going to be fine because there is a bug in Xen when
mapping grant-table frame.
The function gnttab_map_frame() is used to map the grant table frame. If
there is an old mapping, it will first remove it.
The function is using the helper gnttab_map_frame() to find the
corresponding GFN or return INVALID_GFN if not mapped.
On Arm, gnttab_map_frame() is implementing using an array index by the
grant table frame number. The trouble is we don't update the array when
the page is unmapped. So if the GFN is re-used before the grant-table is
remapped, then we will end up to remove whatever was mapped there (this
could be a foreign page...).
This behavior already happens today as the toolstack will use the first
GFN of the region if Linux doesn't support the acquire resource
interface. We are getting away in the Linux because the toolstack only
map the first grant table frame and:
- Newer Linux will not used the region provided by the DT and nothing
will be mapped there.
- Older Linux will use the region but still map the grant table frame
0 to the same GFN.
I am not sure about U-boot and other OSes here.
This is not new but it is going to be become a bigger source of problem
(read more chance to hit it) as we try to re-use the first region.
This means the first region should exclusively used for the grant-table
(in a specific order) until the issue is properly fixed.
A potential fix is to update the array in p2m_put_l3_page(). The default
max size of the array is 1024, so it might be fine to just walk it (it
would be simply a comparison).
Note that this is not a problem on x86 because the is using the M2P. So
when a mapping is removed, the mapping MFN -> GFN will also be removed.
Cheers,
--
Julien Grall