I think the answer is here:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/vm/hmm.rst#n216
Regards,
Felix
*From:*Koenig, Christian
*Sent:* Thursday, September 27, 2018 10:30 AM
*To:* Kuehling, Felix <felix.kuehl...@amd.com>
*Cc:* j.gli...@gmail.com; Yang, Philip <philip.y...@amd.com>;
amd-gfx@lists.freedesktop.org
*Subject:* RE: [PATCH] drm/amdgpu: use HMM mirror callback to
replace mmu notifier v4
At least with get_user_pages() that is perfectly possible.
For HMM it could be that this is prevented somehow.
Christian.
Am 27.09.2018 16:27 schrieb "Kuehling, Felix"
<felix.kuehl...@amd.com <mailto:felix.kuehl...@amd.com>>:
> In this case you can end up accessing pages which are invalidated while
get_user_pages is in process.
What’s the sequence of events you have in mind? Something like this?
* Page table is updated and triggers MMU notifier
* amdgpu MMU notifier runs and waits for pending CS to finish
while holding the read lock
* New CS starts just after invalidate_range_start MMU notifier
finishes but before the page table update is done
* get_user_pages returns outdated physical addresses
I hope that’s not actually possible and that get_user_pages or
hmm_vma_fault would block until the page table update is done.
That is, invalidate_range_start marks the start of a page table
update, and while that update is in progress, get_user_pages or
hmm_vma_fault block. Jerome, can you comment on that?
Thanks,
Felix
*From:*Koenig, Christian
*Sent:* Thursday, September 27, 2018 9:59 AM
*To:* Kuehling, Felix <felix.kuehl...@amd.com
<mailto:felix.kuehl...@amd.com>>
*Cc:* Yang, Philip <philip.y...@amd.com
<mailto:philip.y...@amd.com>>; amd-gfx@lists.freedesktop.org
<mailto:amd-gfx@lists.freedesktop.org>; Jerome Glisse
<j.gli...@gmail.com <mailto:j.gli...@gmail.com>>
*Subject:* RE: [PATCH] drm/amdgpu: use HMM mirror callback to
replace mmu notifier v4
Yeah I understand that, but again that won't work.
In this case you can end up accessing pages which are invalidated
while get_user_pages is in process.
Christian.
Am 27.09.2018 15:41 schrieb "Kuehling, Felix"
<felix.kuehl...@amd.com <mailto:felix.kuehl...@amd.com>>:
> I’m not planning to change that. I don’t think there is any need to
change it.
>
> Yeah, but when HMM doesn't provide both the start and the end
hock of the invalidation this way won't work any more.
>
> So we need to find a solution for this,
> Christian.
My whole argument is that you don’t need to hold the read lock
until the invalidate_range_end. Just read_lock and read_unlock in
the invalidate_range_start function.
Regards,
Felix
*From:*Koenig, Christian
*Sent:* Thursday, September 27, 2018 9:22 AM
*To:* Kuehling, Felix <felix.kuehl...@amd.com
<mailto:felix.kuehl...@amd.com>>
*Cc:* Yang, Philip <philip.y...@amd.com
<mailto:philip.y...@amd.com>>; amd-gfx@lists.freedesktop.org
<mailto:amd-gfx@lists.freedesktop.org>; Jerome Glisse
<j.gli...@gmail.com <mailto:j.gli...@gmail.com>>
*Subject:* Re: [PATCH] drm/amdgpu: use HMM mirror callback to
replace mmu notifier v4
Am 27.09.2018 um 15:18 schrieb Kuehling, Felix:
> The problem is here:
>
> ttm_eu_fence_buffer_objects(&p->ticket, &p->validated,
p->fence);
> amdgpu_mn_unlock(p->mn);
>
> We need to hold the lock until the fence is added to the
reservation object.
>
> Otherwise somebody could have changed the page tables just
in the moment between the check of
amdgpu_ttm_tt_userptr_needs_pages() and adding the fence to
the reservation object.
I’m not planning to change that. I don’t think there is any
need to change it.
Yeah, but when HMM doesn't provide both the start and the end hock
of the invalidation this way won't work any more.
So we need to find a solution for this,
Christian.
Regards,
Felix
*From:*Koenig, Christian
*Sent:* Thursday, September 27, 2018 7:24 AM
*To:* Kuehling, Felix <felix.kuehl...@amd.com>
<mailto:felix.kuehl...@amd.com>
*Cc:* Yang, Philip <philip.y...@amd.com>
<mailto:philip.y...@amd.com>; amd-gfx@lists.freedesktop.org
<mailto:amd-gfx@lists.freedesktop.org>; Jerome Glisse
<j.gli...@gmail.com> <mailto:j.gli...@gmail.com>
*Subject:* Re: [PATCH] drm/amdgpu: use HMM mirror callback to
replace mmu notifier v4
Am 27.09.2018 um 13:08 schrieb Kuehling, Felix:
> We double check that there wasn't any page table
modification while we prepared the submission and restart
the whole process when there actually was some update.
>
> The reason why we need to do this is here:
>
> ttm_eu_fence_buffer_objects(&p->ticket, &p->validated,
p->fence);
> amdgpu_mn_unlock(p->mn);
>
> Only after the new fence is added to the buffer object
we can release the lock so that any invalidation will now
block on our command submission to finish before it
modifies the page table.
I don’t see why this requires holding the read-lock until
invalidate_range_end. amdgpu_ttm_tt_affect_userptr gets
called while the mn read-lock is held in
invalidate_range_start notifier.
That's not related to amdgpu_ttm_tt_affect_userptr(), this
function could actually be called outside the lock.
The problem is here:
ttm_eu_fence_buffer_objects(&p->ticket, &p->validated,
p->fence);
amdgpu_mn_unlock(p->mn);
We need to hold the lock until the fence is added to the
reservation object.
Otherwise somebody could have changed the page tables just in
the moment between the check of
amdgpu_ttm_tt_userptr_needs_pages() and adding the fence to
the reservation object.
Regards,
Christian.
Regards,
Felix
*From:*Koenig, Christian
*Sent:* Thursday, September 27, 2018 5:27 AM
*To:* Kuehling, Felix <felix.kuehl...@amd.com>
<mailto:felix.kuehl...@amd.com>
*Cc:* Yang, Philip <philip.y...@amd.com>
<mailto:philip.y...@amd.com>;
amd-gfx@lists.freedesktop.org
<mailto:amd-gfx@lists.freedesktop.org>; Jerome Glisse
<j.gli...@gmail.com> <mailto:j.gli...@gmail.com>
*Subject:* Re: [PATCH] drm/amdgpu: use HMM mirror callback
to replace mmu notifier v4
That is correct, but take a look what we do when after
calling the amdgpu_mn_read_lock():
/* No memory allocation is allowed while
holding the mn lock */
amdgpu_mn_lock(p->mn);
amdgpu_bo_list_for_each_userptr_entry(e, p->bo_list) {
struct amdgpu_bo *bo =
ttm_to_amdgpu_bo(e->tv.bo);
if
(amdgpu_ttm_tt_userptr_needs_pages(bo->tbo.ttm)) {
r = -ERESTARTSYS;
goto error_abort;
}
}
We double check that there wasn't any page table
modification while we prepared the submission and restart
the whole process when there actually was some update.
The reason why we need to do this is here:
ttm_eu_fence_buffer_objects(&p->ticket, &p->validated,
p->fence);
amdgpu_mn_unlock(p->mn);
Only after the new fence is added to the buffer object we
can release the lock so that any invalidation will now
block on our command submission to finish before it
modifies the page table.
The only other option would be to add the fence first and
then check if there was any update to the page tables.
The issue with that approach is that adding a fence can't
be made undone, so if we find that there actually was an
update to the page tables we would need to somehow turn
the CS into a dummy (e.g. overwrite all IBs with NOPs or
something like that) and still submit it.
Not sure if that is actually possible.
Regards,
Christian.
Am 27.09.2018 um 10:47 schrieb Kuehling, Felix:
So back to my previous question:
>> But do we really need another lock for this?
Wouldn't the
>> re-validation of userptr BOs (currently calling
get_user_pages) force
>> synchronization with the ongoing page table
invalidation through the
>> mmap_sem or other MM locks?
>
> No and yes. We don't hold any other locks while
doing command submission, but I expect that HMM has
its own mechanism to prevent that.
>
> Since we don't modify
amdgpu_mn_lock()/amdgpu_mn_unlock() we are certainly
not using this mechanism correctly.
The existing amdgpu_mn_lock/unlock should block the
MMU notifier while a command submission is in
progress. It should also block command submission
while an MMU notifier is in progress.
What we lose with HMM is the ability to hold a
read-lock for the entire duration of the
invalidate_range_start until invalidate_range_end. As
I understand it, that lock is meant to prevent new
command submissions while the page tables are being
updated by the kernel. But my point is, that
get_user_pages or hmm_vma_fault should do the same
kind of thing. Before the command submission can go
ahead, it needs to update the userptr addresses. If
the page tables are still being updated, it will block
there even without holding the amdgpu_mn_read_lock.
Regards,
Felix
*From:* Koenig, Christian
*Sent:* Thursday, September 27, 2018 3:00 AM
*To:* Kuehling, Felix <felix.kuehl...@amd.com>
<mailto:felix.kuehl...@amd.com>
*Cc:* Yang, Philip <philip.y...@amd.com>
<mailto:philip.y...@amd.com>;
amd-gfx@lists.freedesktop.org
<mailto:amd-gfx@lists.freedesktop.org>; Jerome Glisse
<j.gli...@gmail.com> <mailto:j.gli...@gmail.com>
*Subject:* RE: [PATCH] drm/amdgpu: use HMM mirror
callback to replace mmu notifier v4
No, that won't work. We would still run into lock
inversion problems.
What we could do with the scheduler is to turn
submissions into dummies if we find that the page
tables are now outdated.
But that would be really hacky and I'm not sure if
that would really work in all cases.
Christian.
Am 27.09.2018 08:53 schrieb "Kuehling, Felix"
<felix.kuehl...@amd.com <mailto:felix.kuehl...@amd.com>>:
I had a chat with Jerome yesterday. He pointed out
that the new blockable parameter can be used to infer
whether the MMU notifier is being called in a reclaim
operation. So if blockable==true, it should even be
safe to take the BO reservation lock without problems.
I think with that we should be able to remove the
read-write locking completely and go back to locking
(or try-locking for blockable==false) the reservation
locks in the MMU notifier?
Regards,
Felix
-----Original Message-----
From: amd-gfx <amd-gfx-boun...@lists.freedesktop.org
<mailto:amd-gfx-boun...@lists.freedesktop.org>> On
Behalf Of Christian König
Sent: Saturday, September 15, 2018 3:47 AM
To: Kuehling, Felix <felix.kuehl...@amd.com
<mailto:felix.kuehl...@amd.com>>; Yang, Philip
<philip.y...@amd.com <mailto:philip.y...@amd.com>>;
amd-gfx@lists.freedesktop.org
<mailto:amd-gfx@lists.freedesktop.org>; Jerome Glisse
<j.gli...@gmail.com <mailto:j.gli...@gmail.com>>
Subject: Re: [PATCH] drm/amdgpu: use HMM mirror
callback to replace mmu notifier v4
Am 14.09.2018 um 22:21 schrieb Felix Kuehling:
> On 2018-09-14 01:52 PM, Christian König wrote:
>> Am 14.09.2018 um 19:47 schrieb Philip Yang:
>>> On 2018-09-14 03:51 AM, Christian König wrote:
>>>> Am 13.09.2018 um 23:51 schrieb Felix Kuehling:
>>>>> On 2018-09-13 04:52 PM, Philip Yang wrote:
>>>>> [SNIP]
>>>>>> + amdgpu_mn_read_unlock(amn);
>>>>>> +
>>>>> amdgpu_mn_read_lock/unlock support recursive
locking for multiple
>>>>> overlapping or nested invalidation ranges. But
if you'r locking
>>>>> and unlocking in the same function. Is that
still a concern?
>>> I don't understand the possible recursive case, but
>>> amdgpu_mn_read_lock() still support recursive locking.
>>>> Well the real problem is that unlocking them here
won't work.
>>>>
>>>> We need to hold the lock until we are sure that
the operation which
>>>> updates the page tables is completed.
>>>>
>>> The reason for this change is because hmm mirror has
>>> invalidate_start callback, no invalidate_end callback
>>>
>>> Check mmu_notifier.c and hmm.c again, below is
entire logic to
>>> update CPU page tables and callback:
>>>
>>> mn lock amn->lock is used to protect interval tree
access because
>>> user may submit/register new userptr anytime.
>>> This is same for old and new way.
>>>
>>> step 2 guarantee the GPU operation is done before
updating CPU page
>>> table.
>>>
>>> So I think the change is safe. We don't need hold
mn lock until the
>>> CPU page tables update is completed.
>> No, that isn't even remotely correct. The lock
doesn't protects the
>> interval tree.
>>
>>> Old:
>>> 1. down_read_non_owner(&amn->lock)
>>> 2. loop to handle BOs from node->bos through
interval tree
>>> amn->object nodes
>>> gfx: wait for pending BOs fence operation
done, mark user
>>> pages dirty
>>> kfd: evict user queues of the process,
wait for queue
>>> unmap/map operation done
>>> 3. update CPU page tables
>>> 4. up_read(&amn->lock)
>>>
>>> New, switch step 3 and 4
>>> 1. down_read_non_owner(&amn->lock)
>>> 2. loop to handle BOs from node->bos through
interval tree
>>> amn->object nodes
>>> gfx: wait for pending BOs fence operation
done, mark user
>>> pages dirty
>>> kfd: evict user queues of the process,
wait for queue
>>> unmap/map operation done
>>> 3. up_read(&amn->lock)
>>> 4. update CPU page tables
>> The lock is there to make sure that we serialize
page table updates
>> with command submission.
> As I understand it, the idea is to prevent command
submission (adding
> new fences to BOs) while a page table invalidation
is in progress.
Yes, exactly.
> But do we really need another lock for this?
Wouldn't the
> re-validation of userptr BOs (currently calling
get_user_pages) force
> synchronization with the ongoing page table
invalidation through the
> mmap_sem or other MM locks?
No and yes. We don't hold any other locks while doing
command submission, but I expect that HMM has its own
mechanism to prevent that.
Since we don't modify
amdgpu_mn_lock()/amdgpu_mn_unlock() we are certainly
not using this mechanism correctly.
Regards,
Christian.
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
<mailto:amd-gfx@lists.freedesktop.org>
https://lists.freedesktop.org/mailman/listinfo/amd-gfx