On 11/6/2024 8:42 PM, Alex Deucher wrote:
> On Wed, Nov 6, 2024 at 1:49 AM Victor Zhao <victor.z...@amd.com> wrote:
>>
>> From: Monk Liu <monk....@amd.com>
>>
>> Cached GTT buffers are snooped, which guarantees coherence between CPU
>> writes and GPU fetches. The original code, however, uses WC + unsnooped
>> memory for the HIQ PQ (ring buffer), which introduces coherency issues:
>> the MEC fetches stale data from the PQ, which leads to a MEC hang.
> 
> Can you elaborate on this?  I can see CPU reads being slower because
> the memory is uncached, but the ring buffer is mostly writes anyway.
> IIRC, the driver uses USWC for most if not all of the other ring
> buffers managed by the kernel.  Why aren't those a problem?

We have this on the other rings:
        mb();
        amdgpu_ring_set_wptr(ring);

I think the solution should be to use a barrier before write pointer
updates rather than relying on PCIe snooping.
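
To make the ordering concrete, here is a rough userspace sketch in C11,
not the amdgpu code. The names (demo_ring, demo_ring_submit,
demo_ring_fetch, RING_ENTRIES) are hypothetical, and
atomic_thread_fence() only stands in for the kernel's mb(); how a
barrier interacts with WC mappings and the device is platform specific
and is not modelled here. The point is only the sequence: fill the ring
entry, issue the barrier, then publish the write pointer.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

#define RING_ENTRIES 8u /* hypothetical size, must be a power of two */

static_assert((RING_ENTRIES & (RING_ENTRIES - 1)) == 0,
              "RING_ENTRIES must be a power of two");

/* Hypothetical stand-in for a ring buffer plus its write pointer. */
struct demo_ring {
    uint32_t buf[RING_ENTRIES]; /* plays the role of the PQ */
    _Atomic uint32_t wptr;      /* plays the role of the write pointer */
};

static void demo_ring_init(struct demo_ring *r)
{
    memset(r->buf, 0, sizeof(r->buf));
    atomic_init(&r->wptr, 0);
}

/*
 * Producer side: write the packet, fence, then publish wptr.
 * Ring-full handling is omitted to keep the sketch short.
 */
static void demo_ring_submit(struct demo_ring *r, uint32_t packet)
{
    uint32_t w = atomic_load_explicit(&r->wptr, memory_order_relaxed);

    r->buf[w & (RING_ENTRIES - 1)] = packet;

    /* Stand-in for mb(): the packet write is ordered before the wptr store. */
    atomic_thread_fence(memory_order_seq_cst);

    atomic_store_explicit(&r->wptr, w + 1, memory_order_relaxed);
}

/*
 * Consumer side: returns 1 and stores the packet in *out if one is
 * available, 0 if the ring is empty. *rptr is the consumer's read index.
 */
static int demo_ring_fetch(struct demo_ring *r, uint32_t *rptr, uint32_t *out)
{
    uint32_t w = atomic_load_explicit(&r->wptr, memory_order_relaxed);

    if (*rptr == w)
        return 0;

    /* Pairs with the producer's fence: the entry is read after wptr. */
    atomic_thread_fence(memory_order_seq_cst);

    *out = r->buf[*rptr & (RING_ENTRIES - 1)];
    (*rptr)++;
    return 1;
}
```

Without the fence in demo_ring_submit(), the consumer could observe the
new wptr before the packet write and fetch stale data, which is the
failure mode described in the commit message.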

Thanks,
Lijo

> 
> Alex
> 
>>
>> Signed-off-by: Monk Liu <monk....@amd.com>
>> ---
>>  drivers/gpu/drm/amd/amdkfd/kfd_device.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>> index 1f1d79ac5e6c..fb087a0ff5bc 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>> @@ -779,7 +779,7 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
>>         if (amdgpu_amdkfd_alloc_gtt_mem(
>>                         kfd->adev, size, &kfd->gtt_mem,
>>                         &kfd->gtt_start_gpu_addr, &kfd->gtt_start_cpu_ptr,
>> -                       false, true)) {
>> +                       false, false)) {
>>                 dev_err(kfd_device, "Could not allocate %d bytes\n", size);
>>                 goto alloc_gtt_mem_failure;
>>         }
>> --
>> 2.34.1
>>
