On 7/23/23 21:24, Taylor R Campbell wrote:
Date: Sun, 23 Jul 2023 12:36:10 +0900
From: PHO <p...@cielonegro.org>
This:
https://github.com/depressed-pho/netbsd-src/blob/91daa67f17222da355d3fddd6fa849c786d9c545/sys/external/bsd/drm2/dist/drm/vmwgfx/vmwgfx_fence.c#L45
and this:
https://github.com/depressed-pho/netbsd-src/blob/91daa67f17222da355d3fddd6fa849c786d9c545/sys/external/bsd/drm2/dist/drm/vmwgfx/vmwgfx_fence.c#L527
vma_fence_manager::fence_list holds pointers to all the fence objects in
use but the list doesn't retain their refcount. I tried to change the
way how it works by simply incrementing the refcount in
vmw_fence_obj_init() and decrementing it in __vmw_fences_update(). But
it didn't work well because fences could also be released by
dma_resv_add_excl_fence() and dma_resv_add_shared_fence(). I cannot
recall all the details but sometimes refcounts went down to zero even if
they were retained by fence_list, and that's when I gave up fighting
against it. I admit I don't fully understand the code.
Why can't the loop just use dma_fence_get_rcu and dma_fence_put?
Might need to release and reacquire the lock around dma_fence_put.
I tried but it didn't work, apparently because someone was holding a
pointer to a dying fence and waiting for it to be signaled. Using
dma_fence_get_rcu() meant we didn't signal such fences, and the kernel
hanged and eventually got killed by heartbeat().
If that doesn't work for some reason and we do have to add a special
case, I'd strongly prefer to add a new front end like
dma_fence_signal_vmwgfxspecialhack or something (other examples in the
tasklet API) that has the relaxed assertion instead of relaxing the
assertions for every other driver which isn't bonkers.
Alright. I'll do that.