Hi,
On 3/12/26 15:32, Tvrtko Ursulin wrote:
>
> On 10/03/2026 19:13, Christian König wrote:
>> Instead of comming up with more sophisticated names for states a VM BO
>
> s/comming/coming/, and maybe a comma after "in" in the row below.
>
>> can be in group them by the type of BO first and then by the state.
>>
>> So we end with BO type kernel, shared_resv and individual_resv and then
>> states evicted, moved and idle.
>>
>> Not much functional change, except that evicted_user is moved back
>> together with the other BOs again which makes the handling in
>> amdgpu_vm_validate() a bit more complex. Also fixes a problem with user
>> queues and amdgpu_vm_ready().
>
> It would be good to describe the problem at least a little bit, especially if
> it was a significant reason for the patch.
Thanks for the comments, fixed in the next version.
>> -static void amdgpu_vm_bo_invalidated(struct amdgpu_vm_bo_base *vm_bo)
>> +/* Eventually unlock the status list lock again */
>> +static void amdgpu_vm_bo_unlock_lists(struct amdgpu_vm_bo_base *vm_bo)
>> {
>> - spin_lock(&vm_bo->vm->invalidated_lock);
>> - list_move(&vm_bo->vm_status, &vm_bo->vm->invalidated);
>> - spin_unlock(&vm_bo->vm->invalidated_lock);
>> + if (!amdgpu_vm_is_bo_always_valid(vm_bo->vm, vm_bo->bo))
>> + spin_unlock(&vm_bo->vm->invalidated_lock);
>
> Worth putting an assert vm is locked on the else path? Could be given the new
> API (amdgpu_vm_bo_lock_lists) is a bit odd in that for always valid it
> expects vm already locked and for other takes a different lock.
Good point, fixed as well.
>> @@ -412,14 +384,16 @@ void amdgpu_vm_bo_base_init(struct amdgpu_vm_bo_base
>> *base,
>> amdgpu_vm_update_stats_locked(base, bo->tbo.resource, +1);
>> spin_unlock(&vm->stats_lock);
>> - if (!amdgpu_vm_is_bo_always_valid(vm, bo))
>> + if (!amdgpu_vm_is_bo_always_valid(vm, bo)) {
>> + amdgpu_vm_bo_idle(base);
>
> Hm on what list is it today?
None, the status member was just initialized with INIT_LIST_HEAD().
Since it doesn't have any mappings yet, putting it on the idle list seems
perfectly valid to me.
>> int amdgpu_vm_lock_done_list(struct amdgpu_vm *vm, struct drm_exec *exec,
>> unsigned int num_fences)
>> {
>> - struct list_head *prev = &vm->done;
>> + struct list_head *prev = &vm->individual_resv.idle;
>
> Should this access be under the lock?
No, prev is a pointer which is always valid; in this case it points to the
list head.
Only prev->next can be trusted while holding the spinlock. That is also
documented by the comment below.
>> struct amdgpu_bo_va *bo_va;
>> struct amdgpu_bo *bo;
>> int ret;
>> /* We can only trust prev->next while holding the lock */
>> spin_lock(&vm->invalidated_lock);
>> - while (!list_is_head(prev->next, &vm->done)) {
>> + while (!list_is_head(prev->next, &vm->individual_resv.idle)) {
>> bo_va = list_entry(prev->next, typeof(*bo_va), base.vm_status);
>> bo = bo_va->base.bo;
>> @@ -584,7 +558,6 @@ int amdgpu_vm_validate(struct amdgpu_device *adev,
>> struct amdgpu_vm *vm,
>> {
>> uint64_t new_vm_generation = amdgpu_vm_generation(adev, vm);
>> struct amdgpu_vm_bo_base *bo_base, *tmp;
>> - struct amdgpu_bo *bo;
>> int r;
>> if (vm->generation != new_vm_generation) {
>> @@ -596,38 +569,52 @@ int amdgpu_vm_validate(struct amdgpu_device *adev,
>> struct amdgpu_vm *vm,
>> return r;
>> }
>> - list_for_each_entry_safe(bo_base, tmp, &vm->evicted, vm_status) {
>> - bo = bo_base->bo;
>> -
>> - r = validate(param, bo);
>> + list_for_each_entry_safe(bo_base, tmp, &vm->kernel.evicted, vm_status) {
>> + r = validate(param, bo_base->bo);
>> if (r)
>> return r;
>> - if (bo->tbo.type != ttm_bo_type_kernel) {
>> - amdgpu_vm_bo_moved(bo_base);
>> - } else {
>> - vm->update_funcs->map_table(to_amdgpu_bo_vm(bo));
>> - amdgpu_vm_bo_relocated(bo_base);
>> - }
>> + vm->update_funcs->map_table(to_amdgpu_bo_vm(bo_base->bo));
>> + amdgpu_vm_bo_moved(bo_base);
>> }
>> - if (ticket) {
>> - list_for_each_entry_safe(bo_base, tmp, &vm->evicted_user,
>> - vm_status) {
>> - bo = bo_base->bo;
>> - dma_resv_assert_held(bo->tbo.base.resv);
>> + amdgpu_vm_eviction_lock(vm);
>> + vm->evicting = false;
>> + amdgpu_vm_eviction_unlock(vm);
>
> Is there a specific reason this block is right here and not at the end as
> today?
The evicting status reflects whether the page tables are at the location
where they should be for an update (VRAM/GTT, accessible by the GPU).
The status of the always-valid/per-VM BOs doesn't matter for that.
This has caused a few problems for user queues and is the issue I described
in the commit message.
>
>> - r = validate(param, bo);
>> - if (r)
>> - return r;
>> + list_for_each_entry_safe(bo_base, tmp, &vm->shared_resv.evicted,
>> + vm_status) {
>> + r = validate(param, bo_base->bo);
>> + if (r)
>> + return r;
>> - amdgpu_vm_bo_invalidated(bo_base);
>> - }
>> + amdgpu_vm_bo_moved(bo_base);
>> }
>> - amdgpu_vm_eviction_lock(vm);
>> - vm->evicting = false;
>> - amdgpu_vm_eviction_unlock(vm);
>> + if (!ticket)
>> + return 0;
>> +
>> + spin_lock(&vm->invalidated_lock);
>> + list_for_each_entry(bo_base, &vm->individual_resv.evicted, vm_status) {
>> + struct amdgpu_bo *bo = bo_base->bo;
>> +
>> + if (dma_resv_locking_ctx(bo->tbo.base.resv) != ticket)
>> + continue;
>
> What is this for?
Only certain elements on the list are locked; we skip the ones which aren't.
>
>> +
>> + spin_unlock(&vm->invalidated_lock);
>> +
>> + r = validate(param, bo);
>> + if (r)
>> + return r;
>> +
>> + /* need to grab the invalidated lock to trust prev here */
>> + spin_lock(&vm->invalidated_lock);
>> + tmp = list_entry(bo_base->vm_status.prev, typeof(*tmp),
>> + vm_status);
>
> Why is this safe? Lock was dropped while the current element was left in the
> list so anything could have happened. Even if current element was unlinked
> before dropping the lock, I don't see how it is safe to assume the previous
> element is still valid. Does it rely on dma-resv being held over the whole
> function? The current method of restarting from the head is certainly easier
> to understand.
The whole list is protected by the spinlock, but individual elements can
only move while holding their resv lock.
What we do is walk the list until we find an element whose resv lock we
hold, so that we know it is safe to drop the spinlock.
We then validate the entry we found and restart from the entry before the
one we found.
Going over the list again from the head until we can't find any more
entries is probably easier to understand, but also less efficient.
>> - seq_puts(m, "\tRelocated BOs:\n");
>> - list_for_each_entry_safe(bo_va, tmp, &vm->relocated, base.vm_status) {
>> - if (!bo_va->base.bo)
>> - continue;
>> - total_relocated += amdgpu_bo_print_info(id++, bo_va->base.bo, m);
>> + amdgpu_bo_print_info(id++, base->bo, m);
>
> Probably just thinking out loud - given the format of the debugfs file is
> changing anyway, and that this id is both unstable cat-to-cat and also has no
> relation to the user handle which is output by the only other caller so could
> be confusing/misleading, I wonder if it is even worth bothering with it. For
> example you could pass zero and even modify (or not, optional)
> amdgpu_bo_print_info to skip it if handle is zero.
Yeah, I was thinking the same thing; the id is really completely pointless.
But I think that is something for a different patch.
>> *
>> * Lists are protected by the invalidated_lock.
>> */
>> spinlock_t invalidated_lock;
>> - /* BOs for user mode queues that need a validation */
>> - struct list_head evicted_user;
>> -
>> - /* regular invalidated BOs, but not yet updated in the PT */
>> - struct list_head invalidated;
>> -
>> - /* BOs which are invalidated, has been updated in the PTs */
>> - struct list_head done;
>> + /* Userspace BOs with individual resv object */
>> + struct amdgpu_vm_bo_status individual_resv;
>
> Are all state transitions valid for all lists? If not it would be good to put
> that info in the respective comments.
Yes they are.
Thanks for the review,
Christian.
>
>> /*
>> * This list contains amdgpu_bo_va_mapping objects which have been
>> freed
>
> Regards,
>
> Tvrtko