On Tue, 2022-01-04 at 13:56 +0000, Tvrtko Ursulin wrote:
> 
> > The flow of events are as below:
> > 
> > 1. guc sends notification that an error capture was done and ready to take.
> >     - at this point we copy the guc error captured dump into an interim
> >       store (larger buffer that can hold multiple captures).
> > 2. guc sends notification that a context was reset (after the prior)
> >     - this triggers a call to i915_gpu_coredump with the corresponding
> >       engine-mask from the context that was reset
> >     - i915_gpu_coredump proceeds to gather entire gpu state including
> >       driver state, global gpu state, engine state, context vmas and also
> >       engine registers. For the engine registers now call into the
> >       guc_capture code which merely needs to verify that GuC had already
> >       done a step 1 and we have data ready to be parsed.
> 
> What about the time between the actual reset and receiving the context 
> reset notification? Latter will contain intel_context->guc_id - can that 
> be re-assigned or "retired" in between the two and so cause problems for 
> matching the correct (or any) vmas?
> 
No, it cannot, because it's only after the context reset notification that
i915 starts taking action against that context, and even that happens after
the i915_gpu_coredump (engine-mask-of-context) call.
That's what I've observed in the code flow.
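
To make the ordering concrete, here is a rough standalone sketch of the two
notifications as I understand them. This is not the actual i915 code: every
sketch_* name is made up for illustration, and keying the interim store by
guc_id is an assumption on my part.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MAX_CAPTURES 4   /* hypothetical interim store depth */
#define SKETCH_DUMP_BYTES   16  /* hypothetical size of one capture */

/* One captured register dump, tagged with the context it belongs to. */
struct sketch_capture {
	bool     valid;
	uint32_t guc_id;
	uint8_t  dump[SKETCH_DUMP_BYTES];
};

/* Interim store: larger buffer that can hold multiple captures. */
struct sketch_store {
	struct sketch_capture slot[SKETCH_MAX_CAPTURES];
};

/* Step 1: GuC says a capture is ready, so copy it into the interim store. */
static bool sketch_on_capture_ready(struct sketch_store *s, uint32_t guc_id,
				    const uint8_t *dump, size_t len)
{
	assert(s && dump);
	if (len > SKETCH_DUMP_BYTES)
		return false;

	for (int i = 0; i < SKETCH_MAX_CAPTURES; i++) {
		struct sketch_capture *c = &s->slot[i];

		if (c->valid)
			continue;
		memset(c->dump, 0, sizeof(c->dump));
		memcpy(c->dump, dump, len);
		c->guc_id = guc_id;
		c->valid = true;
		return true;
	}
	return false; /* interim store is full */
}

/*
 * Step 2: GuC says a context was reset. The coredump path only needs to
 * verify that step 1 already happened for this context and take the data.
 * The guc_id is assumed to stay assigned until this handler has run.
 */
static bool sketch_on_context_reset(struct sketch_store *s, uint32_t guc_id,
				    uint8_t *out)
{
	assert(s && out);

	for (int i = 0; i < SKETCH_MAX_CAPTURES; i++) {
		struct sketch_capture *c = &s->slot[i];

		if (!c->valid || c->guc_id != guc_id)
			continue;
		memcpy(out, c->dump, sizeof(c->dump));
		c->valid = false; /* consumed by the coredump */
		return true;
	}
	return false; /* no capture from step 1 for this context */
}
```

A reset notification with no prior capture for that guc_id simply finds
nothing in the store, which is the case the verify step has to handle.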

> Regards,
> 
> Tvrtko
