On 29/03/22 9:30 pm, Arunpravin Paneer Selvam wrote:
> 
> 
> On 29/03/22 4:54 pm, Christian König wrote:
>> Am 29.03.22 um 13:19 schrieb Arunpravin Paneer Selvam:
>>> [SNIP]
>>>>> + pages_left = node->base.num_pages;
>>>>>    
>>>>>           i = 0;
>>>>> - spin_lock(&mgr->lock);
>>>>>           while (pages_left) {
>>>>> -         uint32_t alignment = tbo->page_alignment;
>>>>> +         if (tbo->page_alignment)
>>>>> +                 min_page_size = tbo->page_alignment << PAGE_SHIFT;
>>>>> +         else
>>>>> +                 min_page_size = mgr->default_page_size;
>>>> The handling here looks extremely awkward to me.
>>>>
>>>> min_page_size should be determined outside of the loop, based on 
>>>> default_page_size, alignment and contiguous flag.
>>> I kept min_page_size determine logic inside the loop for cases 2GiB+
>>> requirements, Since now we are round up the size to the required
>>> alignment, I modified the min_page_size determine logic outside of the
>>> loop in v12. Please review.
>>
>> Ah! So do we only have the loop so that each allocation isn't bigger 
>> than 2GiB? If yes couldn't we instead add a max_alloc_size or something 
>> similar?
> yes we have the loop to limit the allocation not bigger than 2GiB, I
> think we cannot avoid the loop since we need to allocate the remaining
> pages if any, to complete the 2GiB+ size request. In other words, first
> iteration we limit the max size to 2GiB and in the second iteration we
> allocate the left over pages if any.

Hi Christian,

Here my understanding might be incorrect, should we limit the max size =
2GiB and skip all the remaining pages for a 2GiB+ request?

Thanks,
Arun
>>
>> BTW: I strongly suggest that you rename min_page_size to min_alloc_size. 
>> Otherwise somebody could think that those numbers are in pages and not 
>> bytes.
> modified in v12
>>
>>>> Then why do you drop the lock and grab it again inside the loop? And what 
>>>> is "i" actually good for?
>>> modified the lock/unlock placement in v12.
>>>
>>> "i" is to track when there is 2GiB+ contiguous allocation request, first
>>> we allocate 2GiB (due to SG table limit) continuously and the remaining
>>> pages in the next iteration, hence this request can't be a continuous.
>>> To set the placement flag we make use of "i" value. In our case "i"
>>> value becomes 2 and we don't set the below flag.
>>> node->base.placement |= TTM_PL_FLAG_CONTIGUOUS;
>>>
>>> If we don't get such requests, I will remove "i".
>>
>> I'm not sure if that works.
>>
>> As far as I can see drm_buddy_alloc_blocks() can allocate multiple 
>> blocks at the same time, but i is only incremented when we loop.
>>
>> So what you should do instead is to check if node->blocks just contain 
>> exactly one element after the allocation but before the trim.
> ok
>>
>>>>> +
>>>>> +         /* Limit maximum size to 2GB due to SG table limitations */
>>>>> +         pages = min(pages_left, 2UL << (30 - PAGE_SHIFT));
>>>>>    
>>>>>                   if (pages >= pages_per_node)
>>>>> -                 alignment = pages_per_node;
>>>>> -
>>>>> -         r = drm_mm_insert_node_in_range(mm, &node->mm_nodes[i], pages,
>>>>> -                                         alignment, 0, place->fpfn,
>>>>> -                                         lpfn, mode);
>>>>> -         if (unlikely(r)) {
>>>>> -                 if (pages > pages_per_node) {
>>>>> -                         if (is_power_of_2(pages))
>>>>> -                                 pages = pages / 2;
>>>>> -                         else
>>>>> -                                 pages = rounddown_pow_of_two(pages);
>>>>> -                         continue;
>>>>> -                 }
>>>>> -                 goto error_free;
>>>>> +                 min_page_size = pages_per_node << PAGE_SHIFT;
>>>>> +
>>>>> +         if (!is_contiguous && !IS_ALIGNED(pages, min_page_size >> 
>>>>> PAGE_SHIFT))
>>>>> +                 is_contiguous = 1;
>>>>> +
>>>>> +         if (is_contiguous) {
>>>>> +                 pages = roundup_pow_of_two(pages);
>>>>> +                 min_page_size = pages << PAGE_SHIFT;
>>>>> +
>>>>> +                 if (pages > lpfn)
>>>>> +                         lpfn = pages;
>>>>>                   }
>>>>>    
>>>>> -         vis_usage += amdgpu_vram_mgr_vis_size(adev, &node->mm_nodes[i]);
>>>>> -         amdgpu_vram_mgr_virt_start(&node->base, &node->mm_nodes[i]);
>>>>> -         pages_left -= pages;
>>>>> +         BUG_ON(min_page_size < mm->chunk_size);
>>>>> +
>>>>> +         mutex_lock(&mgr->lock);
>>>>> +         r = drm_buddy_alloc_blocks(mm, (u64)place->fpfn << PAGE_SHIFT,
>>>>> +                                    (u64)lpfn << PAGE_SHIFT,
>>>>> +                                    (u64)pages << PAGE_SHIFT,
>>>>> +                                    min_page_size,
>>>>> +                                    &node->blocks,
>>>>> +                                    node->flags);
>>>>> +         mutex_unlock(&mgr->lock);
>>>>> +         if (unlikely(r))
>>>>> +                 goto error_free_blocks;
>>>>> +
>>>>>                   ++i;
>>>>>    
>>>>>                   if (pages > pages_left)
>>>>> -                 pages = pages_left;
>>>>> +                 pages_left = 0;
>>>>> +         else
>>>>> +                 pages_left -= pages;
>>>>>           }
>>>>> - spin_unlock(&mgr->lock);
>>>>>    
>>>>> - if (i == 1)
>>>>> + /* Free unused pages for contiguous allocation */
>>>>> + if (is_contiguous) {
>>>> Well that looks really odd, why is trimming not part of
>>>> drm_buddy_alloc_blocks() ?
>>> we didn't place trim function part of drm_buddy_alloc_blocks since we
>>> thought this function can be a generic one and it can be used by any
>>> other application as well. For example, now we are using it for trimming
>>> the last block in case of size non-alignment with min_page_size.
>>
>> Good argument. Another thing I just realized is that we probably want to 
>> double check if we only allocated one block before the trim.
> ok
>>
>> Thanks,
>> Christian.
>>

Reply via email to