On 09.08.21 17:51, Julien Grall wrote:
Hi,
Hi Julien, all
I am writing down here what we discussed on another thread and on IRC.
This will be easier to track in a single thread.
On 04/08/2021 23:00, Julien Grall wrote:
On 04/08/2021 21:56, Oleksandr wrote:
Now, I am wondering, would it be possible to update/clarify the
current "reg" purpose and use it to pass a safe unallocated space
for any Xen specific mappings (grant, foreign, whatever) instead of
just for the grant table region? In case it is not allowed for any
reason (compatibility PoV, etc), would it be possible to extend a
property by passing an extra range separately, something similar to
how I described above?
I think it should be fine to re-use the same region so long as the size
of the first bank is at least the size of the original region.
While answering the DT binding question on the DT ML, I realized
that this is probably not going to be fine because there is a bug in
Xen when mapping a grant-table frame.
The function gnttab_map_frame() is used to map the grant table frame.
If there is an old mapping, it will first remove it.
The function is using the helper gnttab_get_frame_gfn() to find the
corresponding GFN or return INVALID_GFN if not mapped.
On Arm, gnttab_get_frame_gfn() is implemented using an array indexed by
the grant table frame number. The trouble is we don't update the array
when the page is unmapped. So if the GFN is re-used before the
grant-table is remapped, then we will end up removing whatever was
mapped there (this could be a foreign page...).
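To make the sequence above concrete, here is a toy model of it. This is not Xen code: every name (toy_p2m, toy_frame_gfn, toy_map_grant_frame(), ...) is made up for illustration, the "P2M" is a flat array, and the only thing it tries to show is how a stale array entry leads to removing an unrelated mapping.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_INVALID_GFN UINT64_MAX
#define TOY_NR_FRAMES   4    /* grant-table frames tracked by the toy array */
#define TOY_NR_GFNS     16   /* size of the toy guest physical address space */

enum toy_owner { TOY_NONE, TOY_GRANT, TOY_FOREIGN };

/* Hypothetical stand-in for the P2M: what is mapped at each GFN. */
static enum toy_owner toy_p2m[TOY_NR_GFNS];

/* Hypothetical stand-in for the array indexed by grant table frame number. */
static uint64_t toy_frame_gfn[TOY_NR_FRAMES];

static void toy_reset(void)
{
    for (unsigned int i = 0; i < TOY_NR_GFNS; i++)
        toy_p2m[i] = TOY_NONE;
    for (unsigned int i = 0; i < TOY_NR_FRAMES; i++)
        toy_frame_gfn[i] = TOY_INVALID_GFN;
}

/* The guest unmaps a GFN. The array is NOT updated: this models the bug. */
static void toy_unmap_gfn(uint64_t gfn)
{
    assert(gfn < TOY_NR_GFNS);
    toy_p2m[gfn] = TOY_NONE;
}

static void toy_map_foreign(uint64_t gfn)
{
    assert(gfn < TOY_NR_GFNS);
    toy_p2m[gfn] = TOY_FOREIGN;
}

/* Map grant frame idx at gfn, first removing the mapping the array remembers. */
static void toy_map_grant_frame(unsigned int idx, uint64_t gfn)
{
    assert(idx < TOY_NR_FRAMES && gfn < TOY_NR_GFNS);

    uint64_t old = toy_frame_gfn[idx];

    if (old != TOY_INVALID_GFN)
        toy_p2m[old] = TOY_NONE;    /* may clobber whatever lives there now */

    toy_p2m[gfn] = TOY_GRANT;
    toy_frame_gfn[idx] = gfn;
}

/* Replay the problematic sequence and report what is left at GFN 2. */
static enum toy_owner toy_replay_bug(void)
{
    toy_reset();
    toy_map_grant_frame(0, 2);  /* grant frame 0 lives at GFN 2 */
    toy_unmap_gfn(2);           /* page unmapped, array still says GFN 2 */
    toy_map_foreign(2);         /* GFN 2 re-used for a foreign page */
    toy_map_grant_frame(0, 5);  /* remap: the "old mapping" at GFN 2 is removed */
    return toy_p2m[2];
}
```

In this model toy_replay_bug() returns TOY_NONE: the foreign page at GFN 2 has been removed even though nobody asked for it.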
This behavior already happens today as the toolstack will use the
first GFN of the region if Linux doesn't support the acquire resource
interface. We are getting away with it on Linux because the toolstack
only maps the first grant table frame and:
- Newer Linux will not use the region provided by the DT and nothing
will be mapped there.
- Older Linux will use the region but still map the grant table frame
0 to the same GFN.
I am not sure about U-boot and other OSes here.
This is not new but it is going to become a bigger source of
problems (read: more chances to hit it) as we try to re-use the first
region.
This means the first region should be used exclusively for the
grant-table (in a specific order) until the issue is properly fixed.
A potential fix is to update the array in p2m_put_l3_page(). The
default max size of the array is 1024, so it might be fine to just
walk it (it would be simply a comparison).
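A rough sketch of what "just walk it" could look like, under assumptions: the names below (sketch_frame_gfn, sketch_forget_gfn(), ...) are hypothetical and this is not the Xen implementation. In particular, how the GFN being unmapped would be made available at that point in the real code is left open here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_INVALID_GFN UINT64_MAX
#define SKETCH_MAX_FRAMES  1024   /* default max array size quoted above */

/* Hypothetical array indexed by grant table frame number. */
static uint64_t sketch_frame_gfn[SKETCH_MAX_FRAMES];

static void sketch_init(void)
{
    for (size_t i = 0; i < SKETCH_MAX_FRAMES; i++)
        sketch_frame_gfn[i] = SKETCH_INVALID_GFN;
}

/* Remember that grant frame idx is mapped at gfn. */
static void sketch_record(size_t idx, uint64_t gfn)
{
    assert(idx < SKETCH_MAX_FRAMES);
    sketch_frame_gfn[idx] = gfn;
}

/*
 * To be called when gfn is unmapped: walk the array and invalidate any
 * entry still pointing at it. One comparison per entry.
 * Returns the number of entries that were invalidated.
 */
static size_t sketch_forget_gfn(uint64_t gfn)
{
    size_t hits = 0;

    for (size_t i = 0; i < SKETCH_MAX_FRAMES; i++) {
        if (sketch_frame_gfn[i] == gfn) {
            sketch_frame_gfn[i] = SKETCH_INVALID_GFN;
            hits++;
        }
    }

    return hits;
}
```

With the array kept in sync this way, a later remap of the same grant frame would find INVALID_GFN and have no old mapping to remove.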
Note that this is not a problem on x86 because x86 is using the M2P.
So when a mapping is removed, the mapping MFN -> GFN will also be
removed.
Cheers,
The fix is already pushed:
https://lore.kernel.org/xen-devel/1628890077-12545-1-git-send-email-olekst...@gmail.com/
Thanks!
--
Regards,
Oleksandr Tyshchenko