On 12.11.21 at 17:10, Michel Dänzer wrote:
On 2021-11-12 16:03, Christian König wrote:
On 12.11.21 at 15:30, Michel Dänzer wrote:
On 2021-11-12 15:29, Michel Dänzer wrote:
On 2021-11-12 13:47, Christian König wrote:
Anyway, this unfortunately turned out to be work for Harry and Nicholas. In detail it's 
about this bug report here: 
https://bugzilla.kernel.org/show_bug.cgi?id=214621

Lang was able to reproduce the issue and narrow it down to the pin in 
amdgpu_display_crtc_page_flip_target().

In other words we somehow have an unbalanced pinning of the scanout buffer in 
DC.
DC doesn't use amdgpu_display_crtc_page_flip_target AFAICT. The corresponding 
pin with DC would be in dm_plane_helper_prepare_fb, paired with the unpin in
dm_plane_helper_cleanup_fb.


With non-DC, the pin in amdgpu_display_crtc_page_flip_target is paired with the 
unpin in dm_plane_helper_cleanup_fb
This should say amdgpu_display_unpin_work_func.
Ah! So that is the classic (i.e. non-atomic) path?
Presumably.


& dce_v*_crtc_disable. One thing I notice is that the pin is guarded by if 
(!adev->enable_virtual_display), but the unpins seem unconditional. So could this 
be about virtual display, and the problem is actually trying to unpin a BO that was 
never pinned?
Nope, my educated guess is rather that we free up the BO before 
amdgpu_display_unpin_work_func is called.

I.e. not a pin imbalance, but rather a use-after-free.
amdgpu_display_crtc_page_flip_target calls amdgpu_bo_ref(work->old_abo), and 
amdgpu_display_unpin_work_func calls amdgpu_bo_unref(&work->old_abo) only after 
amdgpu_bo_unpin. So what you describe could only happen if there's an imbalance elsewhere such that 
amdgpu_bo_unref is called more often than amdgpu_bo_ref, or maybe if amdgpu_bo_reserve fails in 
amdgpu_display_unpin_work_func (in which case the "failed to reserve buffer after flip" 
error message should appear in dmesg).
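
To make the two hypotheses above easier to compare, here is a minimal toy model in C. It is only a sketch under stated assumptions: the toy_* types and functions are hypothetical and are not the amdgpu API. It merely mirrors the ordering described above (reference the old BO at flip time, then unpin before unref in the work function) and the observation that the pin is guarded by the virtual display check while the unpin is not.

```c
/* Toy model only: every toy_* name is hypothetical, not real amdgpu code. */
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

struct toy_bo {
    int refcount;   /* stands in for the count behind amdgpu_bo_ref/unref */
    int pin_count;  /* stands in for the count behind pin/unpin */
    bool freed;     /* set once the last reference is dropped */
};

static void toy_bo_ref(struct toy_bo *bo) { bo->refcount++; }

static void toy_bo_unref(struct toy_bo *bo)
{
    assert(bo->refcount > 0);
    if (--bo->refcount == 0)
        bo->freed = true;
}

static void toy_bo_pin(struct toy_bo *bo) { bo->pin_count++; }

/* Returns false where a real driver would be expected to warn. */
static bool toy_bo_unpin(struct toy_bo *bo)
{
    if (bo->freed || bo->pin_count == 0) {
        fprintf(stderr, "toy: unbalanced unpin or use after free\n");
        return false;
    }
    bo->pin_count--;
    return true;
}

/* Flip: reference the old BO; pin the new one only without virtual display. */
static void toy_flip(struct toy_bo *old_bo, struct toy_bo *new_bo,
                     bool virtual_display)
{
    toy_bo_ref(old_bo);
    if (!virtual_display)
        toy_bo_pin(new_bo);
}

/* Unpin work: unconditional unpin first, then drop the flip's reference. */
static bool toy_unpin_work(struct toy_bo *old_bo)
{
    bool ok = toy_bo_unpin(old_bo);
    toy_bo_unref(old_bo);
    return ok;
}
```

In this model, a flip away from a pinned BO whose other reference is dropped early still unpins cleanly, because the reference taken in toy_flip keeps the BO alive until toy_unpin_work has run. With virtual_display set, the new BO is never pinned, so a later toy_unpin_work on it reports an unbalanced unpin. Whether either case corresponds to the actual bug is not established by this sketch.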

Yeah, seen that in the meantime as well.

But we also have a WARN_ON() when the pincount overruns, so that can't be it either.

Long story short I have no idea what's going on here.

Regards,
Christian.
