On 27.03.25 at 10:37, SRINIVASAN SHANMUGAM wrote:
> On 3/27/2025 2:54 PM, Christian König wrote:
>>>>> Overall this change doesn't seem to make much sense to me.
>>>>> Why exactly is isolation->spearhead not pointing to the dummy kernel job 
>>>>> we submit?
>>>> Does the owner check or gang_submit check in
>>>> amdgpu_device_enforce_isolation() fail to set up the spearhead?
>>> I'm currently debugging exactly that.
>>>
>>> Good news is that I can reproduce the problem.
>>
>> I have to take that back. I've tested the cleaner shader functionality a bit 
>> this morning and as far as I can see this works exactly as intended.
>>
>> Srini, what exactly is your use case which doesn't work?
>
> Hi Christian, Good Morning!
>
> The use case is to trigger the cleaner shader via the sysfs entry
> "run_cleaner_shader", independently of "enforce_isolation", so that the
> cleaner shader packet gets submitted to the COMP_1.0.0 ring by default,
> without first enabling enforce_isolation via sysfs.
>

I've tested exactly that and it seems to work perfectly fine:
   kworker/u96:1-209     [020] .....    86.655999: amdgpu_isolation: prev=0000000000000000, next=ffffffffffffffff
   kworker/u96:1-209     [020] .....    86.656190: amdgpu_cleaner_shader: ring=gfx_0.0.0, seqno=2
           <...>-11      [022] .....   150.607688: amdgpu_isolation: prev=ffffffffffffffff, next=0000000000000000
   kworker/u96:0-11      [022] .....   150.608228: amdgpu_cleaner_shader: ring=comp_1.0.0, seqno=2
   kworker/u96:0-11      [022] .....   150.620597: amdgpu_isolation: prev=0000000000000000, next=ffffffffffffffff
   kworker/u96:0-11      [022] .....   150.620624: amdgpu_cleaner_shader: ring=gfx_0.0.0, seqno=1527


The only thing which might be confusing is that if you trigger the cleaner
shader multiple times while the GPU is idle, it only runs once.

But that should be easy to change if necessary.

Regards,
Christian.

> AFAIK, the "isolation->spearhead" initialization is not taken care of in
> the path "amdgpu_gfx_run_cleaner_shader ->
> amdgpu_gfx_run_cleaner_shader_job" (i.e., when we trigger the cleaner
> shader using sysfs "run_cleaner_shader"). The check
> "&job->base.s_fence->scheduled == isolation->spearhead;" in the
> "amdgpu_vm_flush()" function is where the problem shows up: the address
> "&job->base.s_fence->scheduled" does not match the "isolation->spearhead"
> address, so the check evaluates to zero and the cleaner shader is not
> emitted when it is run through the "run_cleaner_shader" sysfs entry.
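
To make the comparison described above concrete, here is a minimal
user-space sketch in C. It is only a toy model, not the amdgpu code: every
struct and function name prefixed with model_ is hypothetical, and the real
kernel structures carry far more state. It only shows that the check is a
pointer-identity test, so it yields zero whenever the spearhead was never
pointed at this job's scheduled fence.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; NOT the real kernel definitions. */
struct model_fence { int seqno; };
struct model_sched_fence { struct model_fence scheduled; };
struct model_job { struct model_sched_fence *s_fence; };
struct model_isolation { struct model_fence *spearhead; };

/*
 * Models the quoted comparison: true only when spearhead holds the
 * address of this job's scheduled fence (pointer identity, not content).
 */
static bool model_job_is_spearhead(const struct model_job *job,
                                   const struct model_isolation *isolation)
{
    assert(job != NULL && job->s_fence != NULL && isolation != NULL);
    return &job->s_fence->scheduled == isolation->spearhead;
}

/* Hypothetical helper: record the job's scheduled fence as the spearhead. */
static void model_set_spearhead(struct model_isolation *isolation,
                                struct model_job *job)
{
    assert(isolation != NULL && job != NULL && job->s_fence != NULL);
    isolation->spearhead = &job->s_fence->scheduled;
}
```

In this model, a job whose fence was never recorded via
model_set_spearhead() makes model_job_is_spearhead() return false, which
corresponds to the "results in zero" case reported above.
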
>
> Best regards,
>
> Srini
>
>>
>> Regards,
>> Christian.
>>
>>> Regards,
>>> Christian.
