On 06/02/2020 12:57, Jan Beulich wrote:
> On 06.02.2020 12:44, Durrant, Paul wrote:
>>> -----Original Message-----
>>> From: Julien Grall <jul...@xen.org>
>>> Sent: 06 February 2020 11:17
>>> To: Durrant, Paul <pdurr...@amazon.co.uk>; xen-devel@lists.xenproject.org
>>> Cc: Grall, Julien <jgr...@amazon.com>
>>> Subject: Re: [PATCH v2] xen/mm: Avoid assuming the page is inuse in
>>> assign_pages()
>>>
>>> Hi Paul,
>>>
>>> On 06/02/2020 10:52, Durrant, Paul wrote:
>>>>> -----Original Message-----
>>>>> From: Julien Grall <jul...@xen.org>
>>>>> Sent: 06 February 2020 10:39
>>>>> To: xen-devel@lists.xenproject.org
>>>>> Cc: jul...@xen.org; Durrant, Paul <pdurr...@amazon.co.uk>; Grall,
>>> Julien
>>>>> <jgr...@amazon.com>
>>>>> Subject: [PATCH v2] xen/mm: Avoid assuming the page is inuse in
>>>>> assign_pages()
>>>>>
>>>>> From: Julien Grall <jgr...@amazon.com>
>>>>>
>>>>> At the moment, assign_pages() on the page to be inuse (PGC_state_inuse)
>>>>> and the state value to be 0.
>>>>>
>>>>> However, the code may race with the page offlining code (see
>>>>> offline_page()). Depending on the ordering, the page may be in
>>> offlining
>>>>> state (PGC_state_offlining) before it is assigned to a domain.
>>>>>
>>>>> On debug build, this may result to hit the assert or just clobber the
>>>>> state. On non-debug build, the state will get clobbered.
>>>>>
>>>>> Incidentally the flag PGC_broken will get clobbered as well.
>>>>>
>>>>> Grab the heap_lock to prevent a race with offline_page() and keep the
>>>>> state and broken flag around.
>>>>>
>>>>> Signed-off-by: Julien Grall <jgr...@amazon.com>
>>>> This seems like a reasonable change. I guess having assign_pages() take
>>> the global lock is no more problem than its existing call to
>>> domain_adjust_tot_pages() which also takes the same lock.
>>>
>>> That's my understanding. Summarizing our discussion IRL for the other,
>>> it is not clear whether the lock is enough here.
>>>
>>>  From my understanding the sequence
>>>
>>> pg[i].count_info &= ...;
>>> pg[i].count_info |= ...;
>>>
>>> could result to multiple read/write from the compiler. We could use a
>>> single assignment, but I still don't think this prevent the compiler to
>>> be use multiple read/write.
>>>
>>> The concern would be a race with get_page_owner_and_reference(). If 1 is
>>> set before the rest of the bits, then you may be able to get the page.
>>>
>>> So I might want to use write_atomic() below. Any opinion?
>>>
>> TBH I wonder if we ought to say that any update to count_info ought to
>> be done by a write_atomic (where it's not already done by cmpxchg).
> I agree.

It won't fix anything, and gives the compiler a harder time.

write_atomic() is a mov instruction.  It prohibits the use of and/or
$imm, mem encodings.

If multiple reads/writes are a concern then the only valid code
generation is for *every* modification of count info to use locked
operations.

Swapping regular C for a single mov instruction here is not going to fix
a bug, if such a bug exists.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Reply via email to