Op 14-10-2021 om 15:56 schreef Tvrtko Ursulin: > > On 14/10/2021 14:45, Maarten Lankhorst wrote: >> Op 14-10-2021 om 15:25 schreef Tvrtko Ursulin: >>> >>> On 14/10/2021 13:05, Maarten Lankhorst wrote: >>>> Op 14-10-2021 om 10:37 schreef Tvrtko Ursulin: >>>>> >>>>> On 13/10/2021 11:41, Maarten Lankhorst wrote: >>>>>> No memory should be allocated when calling i915_gem_object_wait, >>>>>> because it may be called to idle a BO when evicting memory. >>>>>> >>>>>> Fix this by using dma_resv_iter helpers to call >>>>>> i915_gem_object_wait_fence() on each fence, which cleans up the code a >>>>>> lot. >>>>>> Also remove dma_resv_prune, it's questionably. >>>>>> >>>>>> This will result in the following lockdep splat. >>>>> >>>>> <snip> >>>>> >>>>>> @@ -37,56 +36,17 @@ i915_gem_object_wait_reservation(struct dma_resv >>>>>> *resv, >>>>>> unsigned int flags, >>>>>> long timeout) >>>>>> { >>>>>> - struct dma_fence *excl; >>>>>> - bool prune_fences = false; >>>>>> - >>>>>> - if (flags & I915_WAIT_ALL) { >>>>>> - struct dma_fence **shared; >>>>>> - unsigned int count, i; >>>>>> - int ret; >>>>>> + struct dma_resv_iter cursor; >>>>>> + struct dma_fence *fence; >>>>>> - ret = dma_resv_get_fences(resv, &excl, &count, &shared); >>>>>> - if (ret) >>>>>> - return ret; >>>>>> - >>>>>> - for (i = 0; i < count; i++) { >>>>>> - timeout = i915_gem_object_wait_fence(shared[i], >>>>>> - flags, timeout); >>>>>> - if (timeout < 0) >>>>>> - break; >>>>>> + dma_resv_iter_begin(&cursor, resv, flags & I915_WAIT_ALL); >>>>>> + dma_resv_for_each_fence_unlocked(&cursor, fence) { >>>>>> - dma_fence_put(shared[i]); >>>>>> - } >>>>>> - >>>>>> - for (; i < count; i++) >>>>>> - dma_fence_put(shared[i]); >>>>>> - kfree(shared); >>>>>> - >>>>>> - /* >>>>>> - * If both shared fences and an exclusive fence exist, >>>>>> - * then by construction the shared fences must be later >>>>>> - * than the exclusive fence. If we successfully wait for >>>>>> - * all the shared fences, we know that the exclusive fence >>>>>> - * must all be signaled. If all the shared fences are >>>>>> - * signaled, we can prune the array and recover the >>>>>> - * floating references on the fences/requests. >>>>>> - */ >>>>>> - prune_fences = count && timeout >= 0; >>>>>> - } else { >>>>>> - excl = dma_resv_get_excl_unlocked(resv); >>>>>> + timeout = i915_gem_object_wait_fence(fence, flags, timeout); >>>>>> + if (timeout <= 0) >>>>>> + break; >>>>> >>>>> You have another change in behaviour here, well a bug really. When >>>>> userspace passes in zero timeout you fail to report activity in other >>>>> than the first fence. >>>> >>>> Hmm, not necessarily, passing 0 to i915_gem_object_wait_fence timeout = 0 >>>> is a special case and means test only. It will return 1 on success. >>> >>> I tried to enumerate the whole chain here. All for timeout == 0. Please >>> double check I did not make a mistake somewhere since there are many return >>> code inversions here. >>> >>> As building blocks for the whole "game" we have: >>> >>> 1. dma_fence_default_wait, it returns for states: >>> not signaled -> 0 >>> signaled -> 1 >>> >>> 2. i915_request_wait >>> >>> not signaled -> -ETIME >>> signaled -> 0 >>> >>> Then i915_gem_object_wait_fence builds on top of it and has therefore these >>> possible outputs: >>> >>> signaled -> 0 >>> not signaled: >>> i915 path -> -ETIME >>> ext fence -> 0 >>> >>> So this looks a like problem already with 0 for signaled and not signaled. >>> Unless it is by design that the return value does not want to report >>> external fences? But it is not documented and it still waits on them so odd. >>> >>> Then in i915_gem_object_wait_reservation we have a loop: >>> >>> for (i = 0; i < count; i++) { >>> timeout = i915_gem_object_wait_fence(shared[i], >>> flags, timeout); >>> if (timeout < 0) >>> break; >>> >>> So short circuit happens only for i915 fences, by virtue of no negative >>> return codes otherwise. >>> >>> If we focus for i915 fences only for a moment. It means it keeps skipping >>> signaled to check if any is not, therefore returning -ETIME if any is not >>> signaled. i915_gem_object_wait passes the negative return on. >>> >>> With your patch you have: >>> >>> + timeout = i915_gem_object_wait_fence(fence, flags, timeout); >>> + if (timeout <= 0) >>> + break; >>> >>> Which means you break on first signaled fence (i915 or external), therefore >>> missing to report any possible subsequent unsignaled fences. So gem_wait >>> ioctl breaks unless I am missing something. >> >> You're cc'd on a mail I sent to König regarding this. >> "Re: [PATCH 20/28] drm/i915: use new iterator in >> i915_gem_object_wait_reservation" >> 5accca25-8ac3-47ca-ee56-8b33c208f...@linux.intel.com >> >> >> timeout = 0 is a special case, fence_wait should return 1 if signaled, or 0 >> if waiting. Not -ETIME, as i915 does currently. >> >> This means our i915_fence_wait() handler is currently very wrong too, needs >> to be fixed. It returns 0 if timeout = 0 even >> if signaled. >> >> I think it cancels the fail in our gem_object_wait, but more consistency is >> definitely needed first. >> >> I think it's best to keep the current semantics for i915_reuest_wait, but >> make it a wrapper around a >> fixed i915_request_wait_timeout(), which would have the correct return >> semantics. > > Okay you are opening up a new issue here. What I am saying is don't break > gem_wait. :) Christian's patch did not have the "<=" bug, it simply preserved > the existing behaviour. > > Then for the fence->wait() issue you raise, comment is lacking: > > * Must return -ERESTARTSYS if the wait is intr = true and the wait was > * interrupted, and remaining jiffies if fence has signaled, or 0 if wait > * timed out. Can also return other error values on custom > implementations, > * which should be treated as if the fence is signaled. For example a > hardware > * lockup could be reported like that. > > No mention of the timeout == 0 special case so that needs to be fixed as > well. Plenty of issues to work on. > > Regards, > > Tvrtko > Yeah, I fixed this in the next series, but it's a mess.
I added i915_request_wait_timeout that has dma-fence semantics, and used it inside i915_fence_wait. The second patch converted i915_gem_object_wait_reservation to use dma-fence semantics, based on Königs patch and made i915_gem_object_wait handle 0 as -ETIME as well. Still lacking the documentation update. ~Maarten