amdgpu: MCBP based on DRM scheduler (v6)

Michel Dänzer Wed, 28 Sep 2022 08:07:24 -0700

On 2022-09-28 16:46, Christian König wrote:
> On 28.09.22 at 15:52, Michel Dänzer wrote:
>> On 2022-09-28 03:01, Zhu, Jiadong wrote:
>>> Please make sure the UMD calls the libdrm function to create contexts with 
>>> different priorities, e.g.
>>> amdgpu_cs_ctx_create2(device_handle, AMDGPU_CTX_PRIORITY_HIGH, 
>>> &context_handle).
>> Yes, I double-checked that, and that it returns success.
>>
>>
>>> Here is the behavior we could see:
>>> 1. After modprobe amdgpu, two software rings named gfx_high/gfx_low (named 
>>> gfx_sw in the previous patch) are visible in UMR. We can check the 
>>> wptr/rptr to see whether they are being used.
>>> 2. MCBP happens when IBs of two different priorities are submitted at the 
>>> same time. We can check the fence info as below:
>>> "Last signaled trailing fence" increments when MCBP is triggered by the 
>>> KMD. "Last preempted" may not increase, as the MCBP is not triggered from 
>>> the CP.
>>>
>>> --- ring 0 (gfx) ---
>>> Last signaled fence          0x00000001
>>> Last emitted                 0x00000001
>>> Last signaled trailing fence 0x0022eb84
>>> Last emitted                 0x0022eb84
>>> Last preempted               0x00000000
>>> Last reset                   0x00000000
>> I've now tested on this Picasso (GFX9) laptop as well. The "Last signaled 
>> trailing fence" line is changing here (seems to always match the "Last 
>> emitted" line).
>>
>> However, mutter's frame rate still cannot exceed that of GPU-limited 
>> clients. BTW, you can test with a GNOME Wayland session, even without my MR 
>> referenced below. Preemption will just be less effective without that MR. 
>> mutter has used a high priority context when possible for a long time.
>>
>> I'm also seeing intermittent freezes, where not even the mouse cursor moves 
>> for up to around one second, e.g. when interacting with the GNOME top bar. 
>> I'm not sure yet if these are related to this patch series, but I never 
>> noticed it before. I wonder if the freezes might occur when GPU preemption 
>> is attempted.
> 
> Keep in mind that this doesn't have the same fine granularity as the separate 
> hw ring buffer found on gfx10.
> 
> With MCBP we can only preempt on draw command boundaries, while the separate 
> hw ring solution can preempt as soon as a CU is available.


Right, but so far I haven't noticed any positive effect. That and the 
intermittent freezes suggest that MCBP-based preemption isn't actually working 
as intended yet.
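
To compare the counters more systematically than by eyeballing the dump, two 
snapshots of /sys/kernel/debug/dri/0/amdgpu_fence_info can be diffed. The 
following is only a minimal sketch, assuming the dump layout shown in the 
quoted output above. The function names parse_fence_info and changed_counters 
are hypothetical helpers, not part of any existing tool, and the " (2)" suffix 
for the repeated "Last emitted" label is my own convention.

```python
import re

# Matches ring headers such as "--- ring 0 (gfx) ---".
RING_RE = re.compile(r"^--- ring (\d+) \((\S+)\) ---$")
# Matches counter lines such as "Last preempted               0x00000000".
FIELD_RE = re.compile(r"^(Last [A-Za-z ]+?)\s+0x([0-9a-fA-F]+)$")


def parse_fence_info(text):
    """Return {ring_name: {label: value}} parsed from a fence_info dump."""
    rings = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        header = RING_RE.match(line)
        if header:
            current = rings.setdefault(header.group(2), {})
            continue
        field = FIELD_RE.match(line)
        if field and current is not None:
            label = field.group(1)
            # "Last emitted" appears twice per ring in the dump; keep both
            # by suffixing the repeat (assumed convention, see lead-in).
            if label in current:
                label += " (2)"
            current[label] = int(field.group(2), 16)
    return rings


def changed_counters(before, after, ring="gfx"):
    """Return {label: (old, new)} for counters of one ring that differ."""
    old = parse_fence_info(before).get(ring, {})
    new = parse_fence_info(after).get(ring, {})
    return {k: (old[k], new[k]) for k in old if k in new and old[k] != new[k]}
```

Reading the debugfs file typically requires root. Passing the contents of two 
reads taken a few seconds apart to changed_counters() shows whether "Last 
preempted" or only "Last signaled trailing fence" moved in between.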


>>> From: Koenig, Christian <christian.koe...@amd.com>
>>>
>>>> This work is solely for gfx9 (e.g. Vega) and older.
>>>>
>>>> Navi has a completely separate high priority gfx queue we can use for this.
>> Right, but 4c7631800e6b ("drm/amd/amdgpu: add pipe1 hardware support") was 
>> for Sienna Cichlid only, and turned out to be unstable, so it had to be 
>> reverted.
>>
>> It would be nice to make the SW ring solution take effect by default 
>> whenever there is no separate high priority HW gfx queue available (and any 
>> other requirements are met).
> 
> I don't think that this will be a good idea. The hw ring buffer or even hw 
> scheduler have much nicer properties and we should focus on getting that 
> working when available.

Of course, the HW features should have priority. I mean as a fallback when the 
HW features effectively aren't available (which is currently always the case 
with amdgpu, even when the GPU has the HW features).


>>> On 27.09.22 at 19:49, Michel Dänzer wrote:
>>>> On 2022-09-27 08:06, Christian König wrote:
>>>>> Hey Michel,
>>>>>
>>>>> Jiadong is working on exposing high/low priority gfx queues for gfx9 and 
>>>>> older hw generations by using mid command buffer preemption.
>>>> Yeah, I've been keeping an eye on these patches. I'm looking forward to 
>>>> this working.
>>>>
>>>>
>>>>> I know that you have been working on GNOME Mutter to make use of this 
>>>>> from userspace. Do you have time to run some tests with that?
>>>> I just tested the v8 series (first without amdgpu.mcbp=1 on the kernel 
>>>> command line, then with it, since I wasn't sure if it's needed) with 
>>>> https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/1880
>>>>  on Navi 14.
>>>>
>>>> Unfortunately, I'm not seeing any change in behaviour. Even though mutter 
>>>> uses a high priority context via the EGL_IMG_context_priority extension, 
>>>> it's unable to reach a higher frame rate than a GPU-limited client[0]. The 
>>>> "Last preempted" line of /sys/kernel/debug/dri/0/amdgpu_fence_info remains 
>>>> at 0x00000000.
>>>>
>>>> Did I miss a step?
>>>>
>>>>
>>>> [0] I used the GpuTest pixmark piano & plot3d benchmarks. With an Intel 
>>>> iGPU, mutter can achieve a higher frame rate than plot3d, though not than 
>>>> pixmark piano (presumably due to limited GPU preemption granularity).
>>
> 

-- 
Earthling Michel Dänzer            |                  https://redhat.com
Libre software enthusiast          |         Mesa and Xwayland developer
