On 2021-11-22 at 11:16 a.m., Liu, Shaoyun wrote:
> [AMD Official Use Only]
>
> Thanks for the review.
> The hash for the previous change from the gerritgit/amd-staging-drm-next
> branch is 7079e7d5c6bf248bff, so is there another drm-next branch, not in
> gerritgit, that is used for upstream?

Yes. amd-staging-drm-next is our AMD internal branch. Alex sends pull
requests to Dave Airlie for his drm-next branch, where the changes get
integrated with all the other DRM driver changes. That usually results
in different commit hashes.

Regards,
  Felix


>
> Thanks 
> Shaoyun.liu
>
>
> -----Original Message-----
> From: Kuehling, Felix <felix.kuehl...@amd.com> 
> Sent: Monday, November 22, 2021 10:40 AM
> To: Liu, Shaoyun <shaoyun....@amd.com>; amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH] drm/amd/amdgpu: move kfd post_reset out of reset_sriov 
> function
>
> On 2021-11-18 at 11:57 a.m., shaoyunl wrote:
>> For an SRIOV XGMI configuration, the host driver handles the hive
>> reset, so on the guest side reset_sriov is only called once, on one
>> device. This makes kfd post_reset unbalanced with kfd pre_reset,
>> since kfd pre_reset has already been moved out of the reset_sriov
>> function. Move kfd post_reset out of reset_sriov as well so that the
>> two calls are balanced.
>>
>> Signed-off-by: shaoyunl <shaoyun....@amd.com>
> Please change the headline prefix to "drm/amdgpu: ". The extra "/amd" is 
> redundant. And I'd also add a tag
>
> Fixes: 9f4f2c1a3524 ("drm/amd/amdgpu: fix the kfd pre_reset sequence in
> sriov")
>
> Note that the commit hash is the one from the drm-next branch, which is what 
> will get merged into master eventually. With those changes, the patch is
>
> Reviewed-by: Felix Kuehling <felix.kuehl...@amd.com>
>
>
>> ---
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 7 +++----
>>  1 file changed, 3 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> index 10c8008d1da0..9a9d5493c676 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> @@ -4308,7 +4308,6 @@ static int amdgpu_device_reset_sriov(struct amdgpu_device *adev,
>>  
>>      amdgpu_irq_gpu_reset_resume_helper(adev);
>>      r = amdgpu_ib_ring_tests(adev);
>> -    amdgpu_amdkfd_post_reset(adev);
>>  
>>  error:
>>      if (!r && adev->virt.gim_feature & AMDGIM_FEATURE_GIM_FLR_VRAMLOST) {
>> @@ -5081,7 +5080,7 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
>>  
>>      tmp_vram_lost_counter = atomic_read(&((adev)->vram_lost_counter));
>>      /* Actual ASIC resets if needed.*/
>> -    /* TODO Implement XGMI hive reset logic for SRIOV */
>> +    /* Host driver will handle XGMI hive reset for SRIOV */
>>      if (amdgpu_sriov_vf(adev)) {
>>              r = amdgpu_device_reset_sriov(adev, job ? false : true);
>>              if (r)
>> @@ -5141,8 +5140,8 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
>>  
>>  skip_sched_resume:
>>      list_for_each_entry(tmp_adev, device_list_handle, reset_list) {
>> -            /* unlock kfd: SRIOV would do it separately */
>> -            if (!need_emergency_restart && !amdgpu_sriov_vf(tmp_adev))
>> +            /* unlock kfd */
>> +            if (!need_emergency_restart)
>>                      amdgpu_amdkfd_post_reset(tmp_adev);
>>  
>>              /* kfd_post_reset will do nothing if kfd device is not initialized,
