@lists.freedesktop.org
Cc: Deng, Emily ; Pan, Xinhui ; Koenig,
Christian
Subject: [RFC PATCH v2] drm/amdkfd: Run restore_workers on freezable WQs
Make restore workers freezable so we don't have to explicitly flush them in
suspend and GPU reset code paths, and we don't accidentally try to restore BOs
whi
0
[ 84.167691] softirqs last disabled at (1342671): []
__irq_exit_rcu+0xd3/0x140
[ 84.167692] ---[ end trace ]---
[ 84.189957] PM: suspe
Thanks
xinhui
-----Original Message-----
From: Pan, Xinhui
Sent: Friday, November 10, 2023 12:51 PM
To: Kuehling, Felix ; amd-gfx@lists.f
[AMD Official Use Only]
If a GTT BO has been evicted or swapped out, it should sit in the CPU domain.
TTM only allocates a struct ttm_resource, not a struct ttm_range_mgr_node,
for system memory.
Now when we update the mapping for such invalidated BOs, we might walk out
of bounds of struct ttm_resource.
Three poss
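The out-of-bounds risk described in this message can be sketched in plain user-space C. This is only an illustration, not the TTM code: struct resource, struct range_node, MEM_SYSTEM, MEM_GTT and to_range_node() are hypothetical stand-ins for struct ttm_resource and struct ttm_range_mgr_node. It assumes one possible remedy, checking the memory type before converting the base pointer to the larger node type; the snippet does not say which approach the patch took.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real TTM structures have more fields. */
enum mem_type { MEM_SYSTEM, MEM_GTT };

struct resource {			/* plays the role of struct ttm_resource */
	enum mem_type mem_type;
	unsigned long num_pages;
};

struct range_node {			/* plays the role of struct ttm_range_mgr_node */
	struct resource base;
	unsigned long start;		/* exists only when a range node was allocated */
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/*
 * Return the enclosing range node, or NULL for system memory, where only a
 * bare struct resource was allocated and reading ->start would run past it.
 */
static struct range_node *to_range_node(struct resource *res)
{
	if (res->mem_type == MEM_SYSTEM)
		return NULL;
	return container_of(res, struct range_node, base);
}
```

A caller that gets NULL back knows there is no range information to read and has to take a different path for system-memory BOs.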
[AMD Official Use Only]
Unreserve the root BO before returning, otherwise the next allocation deadlocks.
Signed-off-by: xinhui pan
---
drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 11 +--
1 file changed, 5 insertions(+), 6 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
b/driver
On 2021/8/31 12:03, “Grodzovsky, Andrey” wrote:
On 2021-08-30 11:24 p.m., Pan, Xinhui wrote:
> [AMD Official Use Only]
>
> [AMD Official Use Only]
>
> Unreserve the root BO before returning, otherwise the next allocation deadlocks.
>
> Sign
On 2021/8/31 13:38, “Pan, Xinhui” wrote:
On 2021/8/31 12:03, “Grodzovsky, Andrey” wrote:
On 2021-08-30 11:24 p.m., Pan, Xinhui wrote:
> [AMD Official Use Only]
>
> [AMD Official Use Only]
>
> Unreserve root B
Fall through to handle the error instead of return.
Signed-off-by: xinhui pan
---
drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
index 85b292ed5c43..7ddd429052ea 100644
Fall through to handle the error instead of return.
Fixes: f8aab60422c37 ("drm/amdgpu: Initialise drm_gem_object_funcs for
imported BOs")
Cc: sta...@vger.kernel.org
Signed-off-by: xinhui pan
---
drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 23 ++-
1 file changed, 10 insertions(+
[AMD Official Use Only]
A long time ago, someone reported that the system hung during a memory test.
Recently I have been trying to find and understand the potential
deadlock in the ttm/amdgpu code.
This patchset aims to fix the deadlock during ttm populate.
TTM has a parameter called pages_limit, when a
[AMD Official Use Only]
The return value might be -EBUSY, in which case the caller thinks the lru lock
is still held when it actually is not. So return -ENOSPC instead. Otherwise we
hit list corruption.
ttm_bo_cleanup_refs might fail too if BO is not idle. If we return 0,
caller(ttm_tt_populate -> ttm_global_swapout ->ttm
[AMD Official Use Only]
As with vce/vcn, visible VRAM is fine for the IB test.
Commit a11d9ff3ebe0 ("drm/amdgpu: use GTT for
uvd_get_create/destory_msg") says VRAM is not mapped correctly on its author's
platform, which is likely arm64.
So let's switch back to VRAM on x86_64 platforms.
Signed-off-b
> On September 6, 2021 at 17:04, Christian König wrote:
>
>
>
> On 06.09.21 at 03:12, xinhui pan wrote:
>> A long time ago, someone reported that the system hung during a memory test.
>> Recently I have been trying to find and understand the potential
>> deadlock in the ttm/amdgpu code.
>>
>> This patchset aims to fi
[AMD Official Use Only]
It is the internal staging drm-next.
-----Original Message-----
From: Koenig, Christian
Sent: September 6, 2021 19:26
To: Pan, Xinhui ; amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander ; che...@uniontech.com;
dri-de...@lists.freedesktop.org
Subject: Re: [PATCH v2 1/2] drm
> On September 7, 2021 at 20:37, Koenig, Christian wrote:
>
> On 07.09.21 at 14:26, xinhui pan wrote:
>> There is one dedicated IB pool for IB test. So lets use it for uvd msg
>> too.
>>
>> For some older HW, use one reserved BO at specific range.
>>
>> Signed-off-by: xinhui pan
>> ---
>> drivers/gpu/drm/am
> On September 8, 2021 at 14:23, Christian König wrote:
>
> On 08.09.21 at 03:25, Pan, Xinhui wrote:
>>> On September 7, 2021 at 20:37, Koenig, Christian wrote:
>>>
>>> On 07.09.21 at 14:26, xinhui pan wrote:
>>>> There is one dedicated IB pool for IB test. So lets use it fo
[AMD Official Use Only]
Direct IB pool is used for vce/uvd/vcn IB extra msg too. Increase its
size to 64 pages.
Signed-off-by: xinhui pan
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
b/driv
[AMD Official Use Only]
There is one dedicated IB pool for IB test. So let's use it for extra msg
too.
For UVD on older HW, use one reserved BO at specific range.
Signed-off-by: xinhui pan
---
drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c | 173 +++-
drivers/gpu/drm/amd/amdgpu/amd
[AMD Official Use Only]
Yep, vcn needs 128 KB of extra memory. I will make the pool size a constant 256 KB.
From: Koenig, Christian
Sent: Thursday, September 9, 2021 3:14:15 PM
To: Pan, Xinhui ; amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander
Subject: Re
[AMD Official Use Only]
Well, if the IB test fails because we use the GTT domain or
VRAM above 256MB, then the failure is expected.
Doesn't the IB test exist to detect such issues?
From: Koenig, Christian
Sent: Thursday, September 9, 2021 15:16
To: Pan, Xinhui; am
; Koenig,
Christian ; Pan, Xinhui ;
Deucher, Alexander
Cc: Chen, Guchun ; Shi, Leslie
Subject: [PATCH] drm/ttm: add a BUG_ON in ttm_set_driver_manager when array
bounds
Vendors define their own memory types on top of TTM_PL_PRIV,
but call ttm_set_driver_manager directly without checking mem_type
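The missing bounds check this patch talks about can be sketched as follows. This is a minimal user-space illustration under stated assumptions: PL_PRIV, NUM_MEM_TYPES, struct manager, struct device and set_driver_manager() are hypothetical names and values, not the TTM definitions. The patch subject mentions a BUG_ON; the sketch returns -EINVAL instead so that it can run outside the kernel.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define PL_PRIV        3	/* hypothetical: first driver-private memory type */
#define NUM_MEM_TYPES  8	/* hypothetical: size of the manager array */

struct manager { int id; };

struct device {
	struct manager *man_drv[NUM_MEM_TYPES];
};

/* Reject out-of-range types instead of writing past man_drv[]. */
static int set_driver_manager(struct device *dev, unsigned int type,
			      struct manager *man)
{
	if (type >= NUM_MEM_TYPES)
		return -EINVAL;
	dev->man_drv[type] = man;
	return 0;
}
```

With these assumed values, a driver-private type such as PL_PRIV + 1 is accepted, while PL_PRIV + 5 is already outside the array and is rejected.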
[AMD Official Use Only]
I am using vim with
set tabstop=8
set shiftwidth=8
set softtabstop=8
From: Koenig, Christian
Sent: September 10, 2021 14:33
To: Pan, Xinhui; amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander
Subject: Re: [PATCH 4/4] drm/amdgpu: VCN avoid
[AMD Official Use Only]
I am wondering whether amdgpu_bo_pin would change the BO's placement in the future.
For now, the new placement is calculated as new = old ∩ new.
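The intersection rule in this message amounts to a bitwise AND when placements are represented as domain bitmasks. A minimal sketch, assuming hypothetical domain bits (DOMAIN_CPU, DOMAIN_GTT, DOMAIN_VRAM) and a hypothetical helper pin_domains(); these are not the amdgpu definitions:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical domain bits, not the amdgpu definitions. */
#define DOMAIN_CPU  (1u << 0)
#define DOMAIN_GTT  (1u << 1)
#define DOMAIN_VRAM (1u << 2)

/* Intersection of two placements: keep only domains present in both masks. */
static uint32_t pin_domains(uint32_t allowed, uint32_t requested)
{
	return allowed & requested;
}
```

An empty result (0) means the requested domains and the currently allowed ones do not overlap, which a caller would have to treat as an error.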
From: Koenig, Christian
Sent: September 10, 2021 14:24
To: Pan, Xinhui; amd-gfx@lists.freedeskto
should use
DIRECT pool.
Looks like we should only use the reserved BO for direct IB submission.
As for delayed IB submission, we could allocate a new one dynamically.
From: Koenig, Christian
Sent: September 10, 2021 16:53
To: Pan, Xinhui; amd-gfx@lists.freedesktop.org
Cc
[AMD Official Use Only]
We need to take this lock.
The IB test can be triggered through debugfs. Recently I have usually tested it by
cat-ing gpu recovery and amdgpu_test_ib in debugfs.
From: Koenig, Christian
Sent: September 10, 2021 18:02
To: Pan, Xinhui; amd-gfx
.
From: Koenig, Christian
Sent: September 10, 2021 19:10
To: Pan, Xinhui; amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander
Subject: Re: RE: RE: RE: [PATCH 2/4] drm/amdgpu: UVD avoid memory allocation during
IB test
Yeah, but that IB test should use the indirect submission through the
scheduler
sync method.
But I see that device resume itself would flush it. So there is no race between them,
as userspace is still frozen.
I will drop this flush in V2.
From: Christian König
Sent: September 11, 2021 15:45
To: Pan, Xinhui; amd-gfx@lists.freedesktop.org
Cc: Deucher
[AMD Official Use Only]
yep, that is a lazy way to fix it.
I am thinking of adding one amdgpu_ring.direct_access_mutex before we issue
test_ib on each ring.
From: Lazar, Lijo
Sent: September 13, 2021 12:00
To: Pan, Xinhui; amd-gfx@lists.freedesktop.org
Cc
: Christian König
Sent: September 13, 2021 14:35
To: Pan, Xinhui; amd-gfx@lists.freedesktop.org
Cc: Koenig, Christian; dan...@ffwll.ch; dri-de...@lists.freedesktop.org; Chen,
Guchun
Subject: Re: [RFC PATCH] drm/ttm: Try to check if new ttm man out of bounds during
compile
Am 13.09.21 um 05:36 schrieb
understand.
From: Koenig, Christian
Sent: September 13, 2021 14:31
To: Pan, Xinhui; amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander
Subject: Re: [PATCH v3 1/3] drm/amdgpu: UVD avoid memory allocation during IB test
On 11.09.21 at 03:34, xinhui pan wrote:
> m
[AMD Official Use Only]
These IB tests are all using direct IB submission including the delayed init
work.
From: Koenig, Christian
Sent: September 13, 2021 14:19
To: Pan, Xinhui; Christian König; amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander
Subject: Re: RE
; Pan, Xinhui;
amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander
Subject: Re: RE: [PATCH v2] drm/amdgpu: Fix a race of IB test
On 9/13/2021 12:21 PM, Christian König wrote:
> Keep in mind that we don't try to avoid contention here. The goal is
> rather to have as few locks as possible t
From: Pan, Xinhui
Sent: September 15, 2021 14:37
To: amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander; Koenig, Christian; Grodzovsky, Andrey; Pan, Xinhui
Subject: [PATCH v2] drm/amdgpu: Put drm_dev_enter/exit outside hot codepath
We hit soft hang while doing memory
[AMD Official Use Only]
Reviewed-by: xinhui pan
-----Original Message-----
From: amd-gfx On Behalf Of Andrey
Grodzovsky
Sent: September 16, 2021 3:42
To: amd-gfx@lists.freedesktop.org
Cc: Quan, Evan ; Pan, Xinhui ; Deucher,
Alexander ; Grodzovsky, Andrey
Subject: [PATCH] drm/amdgpu: Fix crash on
[AMD Official Use Only]
Why? just to evict some inactive vram BOs?
From: Koenig, Christian
Sent: Friday, September 17, 2021 3:06:16 PM
To: Pan, Xinhui ; amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander
Subject: Re: [PATCH] drm/amdgpu: Let BO created in its
, Christian
Sent: November 9, 2021 20:20
To: Pan, Xinhui; amd-gfx@lists.freedesktop.org
Cc: dri-de...@lists.freedesktop.org
Subject: Re: [PATCH] drm/ttm: Put BO in its memory manager's lru list
On 09.11.21 at 12:19, xinhui pan wrote:
> After we move BO to a new memory region, we should put it to
> the
omain_start(adev, mem->mem_type) +
209 mm_cur->start;
210 return 0;
211 }
At line 208, *addr is zero. So when amdgpu_copy_buffer submits a job with such an addr,
a page fault happens.
From: Koenig, Christian
Sent
ist is on vram domain) to sMem.
From: Pan, Xinhui
Sent: November 9, 2021 21:05
To: Koenig, Christian; amd-gfx@lists.freedesktop.org
Cc: dri-de...@lists.freedesktop.org
Subject: RE: RE: [PATCH] drm/ttm: Put BO in its memory manager's lru list
Yes, a stable tag i
Christian
Sent: November 9, 2021 21:18
To: Pan, Xinhui; amd-gfx@lists.freedesktop.org
Cc: dri-de...@lists.freedesktop.org
Subject: Re: RE: RE: [PATCH] drm/ttm: Put BO in its memory manager's lru list
Exactly that's the reason why we should have the double check in TTM
I've mentioned in t
[AMD Official Use Only - Internal Distribution Only]
No, the patch from Nirmoy did not fully fix this issue. I will send another fix
patch later.
-----Original Message-----
From: amd-gfx On Behalf Of Christian
König
Sent: March 20, 2021 17:08
To: Kuehling, Felix ; Paneer Selvam, Arunpravin
; amd
[AMD Official Use Only - Internal Distribution Only]
Because this is not a deadlock on the lock itself.
It is just because of something like
while(true) {
LOCKIRQ
...
UNLOCKIRQ
...
}
I think the scheduler policy is voluntary preemption, so the task is never
scheduled out if there is no sleeping function, and then the soft lockup showed
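The loop pattern described in this message can be illustrated with a user-space sketch that adds an explicit scheduling point. This is an assumption about one common remedy, not the fix from the thread, which the snippet does not show. process_with_yield() and its batch parameter are hypothetical, and sched_yield() stands in for a kernel scheduling point such as cond_resched().

```c
#include <assert.h>
#include <sched.h>

/*
 * Process `items` units of work and give up the CPU after every `batch`
 * units. Returns how many times the loop yielded. A batch of 0 disables
 * yielding, which models the problematic loop from the message above.
 */
static unsigned int process_with_yield(unsigned int items, unsigned int batch)
{
	unsigned int yields = 0;

	for (unsigned int i = 0; i < items; i++) {
		/* LOCKIRQ ... short critical section ... UNLOCKIRQ */
		if (batch != 0 && (i + 1) % batch == 0) {
			sched_yield();	/* explicit scheduling point */
			yields++;
		}
	}
	return yields;
}
```

The yield happens outside the locked region, so other tasks get a chance to run without the lock being held across the scheduling point.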
[AMD Official Use Only - Internal Distribution Only]
I don’t think so. Start is an offset here. We get the valid physical address from
pages_addr[offset] when we update the mapping.
Btw, what issue are we seeing?
-----Original Message-----
From: amd-gfx On Behalf Of Christian
König
Sent: March 23, 2021 2
[AMD Official Use Only - Internal Distribution Only]
Reviewed-by: xinhui pan
From: Christian König
Sent: Wednesday, May 5, 2021 7:01:46 PM
To: Pan, Xinhui ; Deucher, Alexander
; amd-gfx@lists.freedesktop.org
Subject: [PATCH] MAINTAINERS: Add Xinhui Pan as
_
From: Yu Kuai
Sent: May 17, 2021 16:16
To: Deucher, Alexander; Koenig, Christian; Pan, Xinhui; airl...@linux.ie;
dan...@ffwll.ch
Cc: amd-gfx@lists.freedesktop.org; dri-de...@lists.freedesktop.org;
linux-ker...@vger.kernel.org; yuku...@huawei.com; yi.zh...@huawei.com
Subject: [PATCH] drm/amdgp
Memory
TEST_F(KFDMemoryTest, MemoryAlloc) {
TEST_START(TESTPROFILE_RUNALL)
--
2.25.1
____________
From: Pan, Xinhui
Sent: May 19, 2021 10:28
To: amd-gfx@lists.freedesktop.org
Cc: Kuehling, Felix; Deucher, Alexander; Koenig, Christian;
dri-de...@lists.freedeskt
Chris' patch as I think it doesn't help. Or I can have a
try later.
From: Kuehling, Felix
Sent: May 19, 2021 11:29
To: Pan, Xinhui; amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander; Koenig, Christian; dri-de...@lists.freedesktop.org;
dan...@ffwll.ch
Subject
PRIORITY; ++i) {
- list_for_each_entry(bo, &glob->swap_lru[i], swap) {
[snip]
+ for (i = TTM_PL_SYSTEM; i < TTM_NUM_MEM_TYPES; ++i) {
+ for (j = 0; j < TTM_MAX_BO_PRIORITY; ++j) {
________
From: Pan, Xinhui
Sent: 2021
as TTM_PAGE_FLAG_SWAPPED is set.
Now here is the problem: we swap in data to the ttm backend memory from swap storage.
That just causes the memory to be overwritten.
From: Christian König
Sent: May 19, 2021 18:01
To: Pan, Xinhui; Kuehling, Felix; amd-gfx@lists.freedesktop.org
König; Pan, Xinhui; amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander; dan...@ffwll.ch; Koenig, Christian;
dri-de...@lists.freedesktop.org
Subject: Re: RE: [RFC PATCH 1/2] drm/amdgpu: Fix memory corruption due to swapout
and swapin
Looks like we're creating the userptr BO as ttm_bo_type_devi
I just sent out the patch below yesterday. Swapping an unpopulated BO is indeed
useless.
[RFC PATCH 2/2] drm/ttm: skip swapout when ttm has no backend page.
[AMD Official Use Only]
I just sent out the patch below yesterday. Swapping an unpopulated BO is indeed
useless.
[RFC PATCH 2/2] drm/ttm: skip swapout when ttm has no backend page.
From: Christian König
Sent: May 20, 2021 14:39
To: Pan, Xinhui; Kuehling, Felix
o.bdev->lru_lock);
ret = amdgpu_amdkfd_remove_eviction_fence(bo, ef);
dma_resv_unlock(bo->tbo.base.resv);
From: Kuehling, Felix
Sent: May 22, 2021 2:24
To: Pan, Xinhui; amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander; Koenig, Chris
> On June 15, 2021 at 20:01, Christian König wrote:
>
> On 15.06.21 at 13:57, xinhui pan wrote:
>> Amdgpu set SG flag in populate callback. So TTM still count pages in SG
>> BO.
>
> It's probably better to fix this instead. E.g. why does amdgpu modify the SG
> flag during populate and not during initial
> On June 16, 2021 at 02:22, Kuehling, Felix wrote:
>
> [+Xinhui]
>
>
> On 2021-06-15 at 1:50 p.m., Amber Lin wrote:
>> Calling free_mqd inside of destroy_queue_nocpsch_locked can cause a
>> circular lock. destroy_queue_nocpsch_locked is called under a DQM lock,
>> which is taken in MMU notifiers, potent
> On June 16, 2021 at 12:36, Kuehling, Felix wrote:
>
> On 2021-06-16 at 12:01 a.m., Pan, Xinhui wrote:
>>> On June 16, 2021 at 02:22, Kuehling, Felix wrote:
>>>
>>> [+Xinhui]
>>>
>>>
>>> On 2021-06-15 at 1:50 p.m., Amber Lin wrote:
>>&
> On June 17, 2021 at 06:55, Kuehling, Felix wrote:
>
> On 2021-06-16 4:35 a.m., xinhui pan wrote:
>> Some resource are freed even destroy queue fails.
>
> Looks like you're keeping this behaviour for -ETIME. That is consistent with
> what pqn_destroy_queue does. What you're fixing here is the behaviour f
ang) {
+ retval = -EIO;
+ goto failed_try_destroy_debugged_queue;
+ }
+
if (qpd->is_debug) {
/*
* error, currently we do not allow to destroy a queue
> On June 17, 2021 at 20:02, Pan, Xinhui wrote:
>
> Handle queue destroy failur
Felix
What I am wondering is: if the CP hangs, can we assume all usermode
queues have stopped?
If so, we can do the cleanup work regardless of the return value of execute_queues_cpsch().
> On June 17, 2021 at 20:11, Pan, Xinhui wrote:
>
> Felix
> what I am thinking of like below looks like
> On July 14, 2021 at 16:33, Christian König wrote:
>
> Hi Eric,
>
> feel free to push into amd-staging-dkms-5.11, but please don't push it into
> amd-staging-drm-next.
>
> The latter will just cause a merge failure which Alex needs to resolve
> manually.
>
> I can take care of pushing to amd-staging-dr
[AMD Official Use Only - Internal Distribution Only]
drm_dev_alloc() allocates *dev* and sets managed.final_kfree to dev so that it
frees itself.
Since commit 5cdd68498918("drm/amdgpu: Embed drm_device into
amdgpu_device (v3)") we allocate *adev*, and ddev is just a member of it.
So drm_dev_release tries to free a
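The ownership problem in this message, freeing a pointer to an embedded member rather than the allocation that contains it, can be sketched in user-space C. This is only an illustration under stated assumptions: struct base_dev and struct outer_dev are hypothetical stand-ins for drm_device and amdgpu_device, and outer_dev_alloc() and base_dev_release() are made-up helpers, not DRM functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct base_dev { int open_count; };	/* stands in for drm_device */

struct outer_dev {			/* stands in for amdgpu_device */
	int chip_id;
	struct base_dev ddev;		/* embedded member, not at offset 0 */
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Allocate the outer structure and hand out a pointer to the embedded one. */
static struct base_dev *outer_dev_alloc(int chip_id)
{
	struct outer_dev *adev = calloc(1, sizeof(*adev));

	if (!adev)
		return NULL;
	adev->chip_id = chip_id;
	return &adev->ddev;
}

/* Free the allocation that was actually made: the outer structure. */
static void base_dev_release(struct base_dev *dev)
{
	free(container_of(dev, struct outer_dev, ddev));
}
```

Calling free() directly on the embedded pointer would pass an address that the allocator never returned, which is the kind of mismatch the message describes.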
[AMD Official Use Only - Internal Distribution Only]
Remove the private obj from the internal list before we free aconnector.
[ 56.925828] BUG: unable to handle page fault for address: 8f84a870a560
[ 56.933272] #PF: supervisor read access in kernel mode
[ 56.938801] #PF: error_code(0x00
of total release sequence.
Or still use final_kfree to free adev, and our release callback just does some
other cleanup work.
From: Tuikov, Luben
Sent: Wednesday, September 2, 2020 4:35:32 AM
To: Alex Deucher ; Pan, Xinhui ;
Daniel Vetter
Cc: amd-gfx@lists.freed
"Tuikov, Luben"
Date: Wednesday, September 2, 2020 09:07
To: "amd-gfx@lists.freedesktop.org" ,
"dri-de...@lists.freedesktop.org"
Cc: "Deucher, Alexander" , Daniel Vetter
, "Pan, Xinhui" , "Tuikov, Luben"
Subject: [PATCH 0/3] Use implicit kref infra
Use
> On September 2, 2020 at 11:46, Tuikov, Luben wrote:
>
> On 2020-09-01 21:42, Pan, Xinhui wrote:
>> If you take a look at the below function, you should not use driver's
>> release to free adev. As dev is embedded in adev.
>
> Do you mean "look at the function below"
285 list_move(&vm_bo->vm_status, &vm_bo->vm->relocated);
>> 286 else
>> 287 amdgpu_vm_bo_idle(vm_bo);
>> 288 }
>>
>> Why you need to do the bo->parent check out side ?
because it is me that moves such logic into amdgpu_vm_bo_r
> On September 2, 2020 at 20:05, Christian König wrote:
>
> Calculate the correct value for max_entries or we might run after the
> page_address array.
>
> Signed-off-by: Christian König
> Fixes: 1e691e244487 drm/amdgpu: stop allocating dummy GTT nodes
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 3 ++-
> 1
> On September 2, 2020 at 22:05, Christian König wrote:
>
> Calculate the correct value for max_entries or we might run after the
> page_address array.
>
> v2: Xinhui pointed out we don't need the shift
>
> Signed-off-by: Christian König
> Fixes: 1e691e244487 drm/amdgpu: stop allocating dummy GTT nodes
> ---
> On September 2, 2020 at 22:31, Christian König wrote:
>
> On 02.09.20 at 16:27, Pan, Xinhui wrote:
>>
>>> On September 2, 2020 at 22:05, Christian König wrote:
>>>
>>> Calculate the correct value for max_entries or we might run after the
>>> page_address array.
>
> On September 2, 2020 at 23:21, Christian König wrote:
>
> Calculate the correct value for max_entries or we might run after the
> page_address array.
>
> v2: Xinhui pointed out we don't need the shift
> v3: use local copy of start and simplify some calculation
>
> Signed-off-by: Christian König
> Fixes: 1e6
> On September 2, 2020 at 22:50, Tuikov, Luben wrote:
>
> On 2020-09-02 00:43, Pan, Xinhui wrote:
>>
>>
>>> On September 2, 2020 at 11:46, Tuikov, Luben wrote:
>>>
>>> On 2020-09-01 21:42, Pan, Xinhui wrote:
>>>> If you take a look at the below function, you s
Reviewed-by: xinhui pan
> On September 3, 2020 at 17:03, Christian König wrote:
>
> Calculate the correct value for max_entries or we might run after the
> page_address array.
>
> v2: Xinhui pointed out we don't need the shift
> v3: use local copy of start and simplify some calculation
> v4: fix the case that
[AMD Official Use Only - Internal Distribution Only]
Please ignore this patch.
-----Original Message-----
From: Pan, Xinhui
Sent: September 29, 2020 13:17
To: amd-gfx@lists.freedesktop.org
Cc: Koenig, Christian ; Deucher, Alexander
; Pan, Xinhui
Subject: [PATCH] amd/amdgpu: Fix resv shared fence
in its
suspend callback. So the first eviction before the kfd callback likely fails.
-----Original Message-----
From: Christian König
Sent: Friday, September 8, 2023 2:49 PM
To: Pan, Xinhui ; amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander ; Koenig, Christian
; Fan, Shikang
Subject: Re: [PATCH
tTest.BasicTest
pm-suspend
thanks
xinhui
From: Christian König
Sent: September 12, 2023 17:01
To: Pan, Xinhui ; amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander ; Koenig, Christian
; Fan, Shikang
Subject: Re: [PATCH] drm/amdgpu: Ignore first evction failure du
, Christian
Sent: Wednesday, September 13, 2023 10:29 PM
To: Kuehling, Felix ; Christian König
; Pan, Xinhui ;
amd-gfx@lists.freedesktop.org; Wentland, Harry
Cc: Deucher, Alexander ; Fan, Shikang
Subject: Re: RE: [PATCH] drm/amdgpu: Ignore first evction failure during suspend
[+Harry]
Am
ce->flags))
+ goto out;
+
if (intr && signal_pending(current)) {
ret = -ERESTARTSYS;
goto out;
From: Koenig, Christian
Sent: April 12, 2022 20:11
To: Pan, Xinhui; amd-gfx@lists.freedesktop.org; Dani
: April 13, 2022 15:30
To: Pan, Xinhui; amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander
Subject: RE: [PATCH] drm/amdgpu: Make sure ttm delayed work finished
We don't need that.
TTM only reschedules when the BOs are still busy.
And if the BOs are still busy when you unload the driver we have
Christian König ; Pan, Xinhui
; amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander ; Koenig, Christian
Subject: Re: [PATCH] drm/amdgpu: Fix a NULL pointer of fence
On 2022-07-07 at 05:54, Christian König wrote:
> On 07.07.22 at 11:50, xinhui pan wrote:
>> Fence is accessed by dma_r
[AMD Official Use Only - General]
Hi Arun,
Thanks for your reply. Comments are inline.
From: Paneer Selvam, Arunpravin
Sent: November 29, 2022 1:09
To: Pan, Xinhui; amd-gfx@lists.freedesktop.org
Cc: linux-ker...@vger.kernel.org; dri-de...@lists.freedesktop.org
just re-sort these blocks in ascending order if memory is
indeed continuous?
thanks
xinhui
From: Christian König
Sent: November 29, 2022 1:11
To: Pan, Xinhui; amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander
Subject: Re: [PATCH] drm/amdgpu: New method to check
goto err_free;
thanks
xinhui
____________
From: Pan, Xinhui
Sent: November 29, 2022 18:56
To: amd-gfx@lists.freedesktop.org
Cc: dan...@ffwll.ch; matthew.a...@intel.com; Koenig, Christian;
dri-de...@lists.freedesktop.org; linux-ker...@vger.kernel.org; Paneer Se
[AMD Official Use Only - General]
Comments inline.
From: Koenig, Christian
Sent: November 29, 2022 19:32
To: Pan, Xinhui; amd-gfx@lists.freedesktop.org
Cc: dan...@ffwll.ch; matthew.a...@intel.com; dri-de...@lists.freedesktop.org;
linux-ker...@vger.kernel.org
[AMD Official Use Only - General]
Comments inline.
From: Koenig, Christian
Sent: November 29, 2022 20:07
To: Pan, Xinhui; amd-gfx@lists.freedesktop.org
Cc: dan...@ffwll.ch; matthew.a...@intel.com; dri-de...@lists.freedesktop.org;
linux-ker...@vger.kernel.org
[AMD Official Use Only - General]
Can we just add kref for entity?
Or just collect such job time usage somewhere else?
-----Original Message-----
From: Pan, Xinhui
Sent: Thursday, August 17, 2023 1:05 PM
To: amd-gfx@lists.freedesktop.org
Cc: Tuikov, Luben ; airl...@gmail.com;
dri-de
Really cool patch!
Reviewed-by: xinhui pan
-----Original Message-----
From: Kuehling, Felix
Sent: November 26, 2019 3:35
To: amd-gfx@lists.freedesktop.org; Pan, Xinhui
Subject: [PATCH 1/1] drm/amdgpu: Optimize KFD page table reservation
Be less pessimistic about estimated page table use for KFD
Signed-off-by: xinhui pan
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 6 ++
1 file changed, 6 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
index 8b1088dac686..1df6b03a3680 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
Suspend puts the irq, so resume needs to get the irq back.
At the same time, skip the other ras initialization.
Signed-off-by: xinhui pan
---
drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 5 -
drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 5 -
drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 6 +-
3 files cha
Looks good to me.
Thanks.
-----Original Message-----
From: Evan Quan
Sent: March 7, 2019 15:01
To: amd-gfx@lists.freedesktop.org
Cc: Zhang, Hawking ; Pan, Xinhui ;
Deucher, Alexander ; Quan, Evan
Subject: [PATCH] drm/amdgpu: fix ras parameter descriptions
The descriptions of modinfo wrongly show
e ignore bit.
From: Andrey Grodzovsky
Sent: Saturday, March 9, 2019 6:29:36 AM
To: amd-gfx@lists.freedesktop.org
Cc: Pan, Xinhui; Grodzovsky, Andrey
Subject: [PATCH v2] drm/amdgpu: Fix lockdep warning in RAS SYSFS v2
Problem:
When loading driver with debug lockdep
Ta is optional, so check if ta firmware is loaded or not.
Signed-off-by: xinhui pan
---
drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
index 7e3e1d588d74..0bd9df9fd289
Inline.
-----Original Message-----
From: Evan Quan
Sent: March 11, 2019 12:31
To: amd-gfx@lists.freedesktop.org
Cc: Pan, Xinhui ; Deucher, Alexander
; Quan, Evan
Subject: [PATCH 1/2] drm/amdgpu: unify the way to judge RAS feature readiness
Unify the way to judge whether a specific RAS feature
Cc: Pan, Xinhui ; Deucher, Alexander
; Quan, Evan
Subject: [PATCH 2/2] drm/amdgpu: drop unnecessary dereference
It's unnecessary and confusing.
Change-Id: I77fe54a108b7ee2031851b3e11d63c4fb74c0d43
Signed-off-by: Evan Quan
---
drivers/gpu/drm/amd/amdgpu/gfx_v9_0
adev->gfx.ras_if is a pointer and it is NULL at the first time
-----Original Message-----
From: Quan, Evan
Sent: March 11, 2019 13:41
To: Pan, Xinhui ; amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander
Subject: RE: [PATCH 2/2] drm/amdgpu: drop unnecessary dereference
I cannot get your po
Makes sense. I am fixing some other ras problems. Will send them out later.
From: Zhang, Hawking
Sent: March 11, 2019 13:42
To: Pan, Xinhui ; amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander ; Gao, Likun
Subject: RE: [PATCH] drm/amdgpu: Fix NULL pointer when ta is missing
Would it make sense
Hi
This is to fix some issues.
1) null pointer of psp when ta is missing
2) lockdep warning.
3) add hw_supported member which indicates the hardware ability.
4) add amdgpu_ras_post_init to do some initialization.
xinhui pan (4):
drm/amdgpu: Fix NULL pointer when ta is missing
drm/amdgpu: Fix wa
Unzero char is accepted by sscanf, so when data is structure but
unexpectedly return error invalid;
Signed-off-by: xinhui pan
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
b/drivers/gpu/drm/a
It is sysfs_attr_init() that sets attribute->key to a static variable.
I think the proper fix is just to call sysfs_attr_init for each attribute.
Thanks
xinhui
-----Original Message-----
From: Michel Dänzer
Sent: March 11, 2019 18:42
To: Pan, Xinhui
Cc: amd-gfx@lists.freedesktop.org; Deuc
lockdep needs a static key.
Previously we set ignore bit to avoid the warning.
Now call sysfs_attr_init to initialize the static key.
Signed-off-by: xinhui pan
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 8 ++--
1 file changed, 2 insertions(+), 6 deletions(-)
diff --git a/drivers/gpu/drm/a
I have fixed it and pushed it to the internal branch. Sorry for the typo.
-----Original Message-----
From: Paul Menzel
Sent: March 12, 2019 0:00
To: Pan, Xinhui
Cc: amd-gfx@lists.freedesktop.org; Deucher, Alexander
; Michel Dänzer ; Quan, Evan
Subject: Re: [PATCH] drm/amdgpu: Fix lockdep warking more
add drm info output if ras initialized successfully.
add ras atomfirmware sanity check.
Signed-off-by: xinhui pan
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 9 +++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
b/drivers/gpu/drm/am
add ras post init function.
Do some initialization after all IP have finished their late init.
Add new member flags which will control the ras work flow.
For now, vbios enable ras for us on boot. That might change in the
future.
So there should be a flag from vbios to tell us if ras is enabled or