On 29/03/22 4:54 pm, Christian König wrote:
> Am 29.03.22 um 13:19 schrieb Arunpravin Paneer Selvam:
>> [SNIP]
>>>> +  pages_left = node->base.num_pages;
>>>>    
>>>>            i = 0;
>>>> -  spin_lock(&mgr->lock);
>>>>            while (pages_left) {
>>>> -          uint32_t alignment = tbo->page_alignment;
>>>> +          if (tbo->page_alignment)
>>>> +                  min_page_size = tbo->page_alignment << PAGE_SHIFT;
>>>> +          else
>>>> +                  min_page_size = mgr->default_page_size;
>>> The handling here looks extremely awkward to me.
>>>
>>> min_page_size should be determined outside of the loop, based on 
>>> default_page_size, alignment and contiguous flag.
>> I kept the min_page_size determination logic inside the loop to handle 2GiB+
>> requirements. Since we now round up the size to the required alignment, I
>> have moved the min_page_size determination outside of the loop in v12.
>> Please review.
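(For reference, a rough sketch of how the determination could look once it is
moved out of the loop; this is only illustrative, not the exact v12 code, and
it reuses the names from the snippet above.)

    if (tbo->page_alignment)
            min_page_size = (u64)tbo->page_alignment << PAGE_SHIFT;
    else
            min_page_size = mgr->default_page_size;

    /* Illustrative: for a contiguous request, round the request up to a
     * power of two and ask the buddy for one block of exactly that size.
     */
    if (place->flags & TTM_PL_FLAG_CONTIGUOUS) {
            pages = roundup_pow_of_two(pages);
            min_page_size = (u64)pages << PAGE_SHIFT;
    }
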
> 
> Ah! So do we only have the loop so that each allocation isn't bigger 
> than 2GiB? If yes couldn't we instead add a max_alloc_size or something 
> similar?
Yes, the loop is there to keep each allocation no bigger than 2GiB. I don't
think we can avoid the loop, since we need to allocate the remaining pages,
if any, to complete a 2GiB+ request. In other words, in the first iteration
we limit the maximum size to 2GiB, and in the second iteration we allocate
the left-over pages, if any.
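
Roughly, the loop is just (simplified from the diff above, with error
handling and the contiguous special case dropped):

    pages_left = node->base.num_pages;
    while (pages_left) {
            /* Limit each buddy allocation to 2GiB due to SG table limits */
            pages = min(pages_left, 2UL << (30 - PAGE_SHIFT));

            mutex_lock(&mgr->lock);
            r = drm_buddy_alloc_blocks(mm, (u64)place->fpfn << PAGE_SHIFT,
                                       (u64)lpfn << PAGE_SHIFT,
                                       (u64)pages << PAGE_SHIFT,
                                       min_page_size, &node->blocks,
                                       node->flags);
            mutex_unlock(&mgr->lock);
            if (unlikely(r))
                    goto error_free_blocks;

            if (pages > pages_left)
                    pages_left = 0;
            else
                    pages_left -= pages;
    }

So for anything above 2GiB we simply go around the loop more than once.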
> 
> BTW: I strongly suggest that you rename min_page_size to min_alloc_size. 
> Otherwise somebody could think that those numbers are in pages and not 
> bytes.
Renamed in v12.
> 
>>> Then why do you drop the lock and grab it again inside the loop? And what 
>>> is "i" actually good for?
>> Modified the lock/unlock placement in v12.
>>
>> "i" is to track when there is 2GiB+ contiguous allocation request, first
>> we allocate 2GiB (due to SG table limit) continuously and the remaining
>> pages in the next iteration, hence this request can't be a continuous.
>> To set the placement flag we make use of "i" value. In our case "i"
>> value becomes 2 and we don't set the below flag.
>> node->base.placement |= TTM_PL_FLAG_CONTIGUOUS;
>>
>> If we don't get such requests, I will remove "i".
> 
> I'm not sure if that works.
> 
> As far as I can see drm_buddy_alloc_blocks() can allocate multiple 
> blocks at the same time, but i is only incremented when we loop.
> 
> So what you should do instead is to check if node->blocks just contain 
> exactly one element after the allocation but before the trim.
ok
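
Something like this, I assume (sketch only; using list_is_singular() on
node->blocks after the allocation and before the trim):

    /* Only mark the node contiguous if the buddy handed back one block */
    if (place->flags & TTM_PL_FLAG_CONTIGUOUS &&
        list_is_singular(&node->blocks))
            node->base.placement |= TTM_PL_FLAG_CONTIGUOUS;

That would also let me drop "i" entirely.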
> 
>>>> +
>>>> +          /* Limit maximum size to 2GB due to SG table limitations */
>>>> +          pages = min(pages_left, 2UL << (30 - PAGE_SHIFT));
>>>>    
>>>>                    if (pages >= pages_per_node)
>>>> -                  alignment = pages_per_node;
>>>> -
>>>> -          r = drm_mm_insert_node_in_range(mm, &node->mm_nodes[i], pages,
>>>> -                                          alignment, 0, place->fpfn,
>>>> -                                          lpfn, mode);
>>>> -          if (unlikely(r)) {
>>>> -                  if (pages > pages_per_node) {
>>>> -                          if (is_power_of_2(pages))
>>>> -                                  pages = pages / 2;
>>>> -                          else
>>>> -                                  pages = rounddown_pow_of_two(pages);
>>>> -                          continue;
>>>> -                  }
>>>> -                  goto error_free;
>>>> +                  min_page_size = pages_per_node << PAGE_SHIFT;
>>>> +
>>>> +          if (!is_contiguous && !IS_ALIGNED(pages, min_page_size >> PAGE_SHIFT))
>>>> +                  is_contiguous = 1;
>>>> +
>>>> +          if (is_contiguous) {
>>>> +                  pages = roundup_pow_of_two(pages);
>>>> +                  min_page_size = pages << PAGE_SHIFT;
>>>> +
>>>> +                  if (pages > lpfn)
>>>> +                          lpfn = pages;
>>>>                    }
>>>>    
>>>> -          vis_usage += amdgpu_vram_mgr_vis_size(adev, &node->mm_nodes[i]);
>>>> -          amdgpu_vram_mgr_virt_start(&node->base, &node->mm_nodes[i]);
>>>> -          pages_left -= pages;
>>>> +          BUG_ON(min_page_size < mm->chunk_size);
>>>> +
>>>> +          mutex_lock(&mgr->lock);
>>>> +          r = drm_buddy_alloc_blocks(mm, (u64)place->fpfn << PAGE_SHIFT,
>>>> +                                     (u64)lpfn << PAGE_SHIFT,
>>>> +                                     (u64)pages << PAGE_SHIFT,
>>>> +                                     min_page_size,
>>>> +                                     &node->blocks,
>>>> +                                     node->flags);
>>>> +          mutex_unlock(&mgr->lock);
>>>> +          if (unlikely(r))
>>>> +                  goto error_free_blocks;
>>>> +
>>>>                    ++i;
>>>>    
>>>>                    if (pages > pages_left)
>>>> -                  pages = pages_left;
>>>> +                  pages_left = 0;
>>>> +          else
>>>> +                  pages_left -= pages;
>>>>            }
>>>> -  spin_unlock(&mgr->lock);
>>>>    
>>>> -  if (i == 1)
>>>> +  /* Free unused pages for contiguous allocation */
>>>> +  if (is_contiguous) {
>>> Well that looks really odd, why is trimming not part of
>>> drm_buddy_alloc_blocks() ?
>> We didn't make the trim function part of drm_buddy_alloc_blocks() since we
>> thought it could be a generic function which other users can call as well.
>> For example, we now use it to trim the last block when the size is not
>> aligned to min_page_size.
> 
> Good argument. Another thing I just realized is that we probably want to 
> double check if we only allocated one block before the trim.
ok
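
For the record, the trim path would then look roughly like this (sketch only;
drm_buddy_block_trim() is the helper added earlier in this series, exact
signature per the v12 patches):

    /* Free unused pages of a contiguous allocation, but only if the
     * buddy actually gave us a single block to trim.
     */
    if (is_contiguous && list_is_singular(&node->blocks)) {
            u64 actual_size = (u64)node->base.num_pages << PAGE_SHIFT;

            mutex_lock(&mgr->lock);
            r = drm_buddy_block_trim(mm, actual_size, &node->blocks);
            mutex_unlock(&mgr->lock);
            if (unlikely(r))
                    goto error_free_blocks;
    }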
> 
> Thanks,
> Christian.
> 
