r
Cc: Lazar, Lijo , SHANMUGAM, SRINIVASAN
, Liu, Monk ,
amd-gfx@lists.freedesktop.org , Zhang, Hawking
Subject: [PATCH] drm/amdgpu: Fix 'adev->gfx.rlc_fw' from request_firmware() not
released in 'gfx_v10_0_init_microcode()'
'adev->gfx.rlc_fw' may not be relea
Cc: Liu, Monk ; Koenig, Christian ;
Grodzovsky, Andrey ; Jack Zhang
; Chen, JingWen ; Jack Zhang
; Grodzovsky, Andrey
Subject: [PATCH v5] drm/amd/amdgpu embed hw_fence into amdgpu_job
From: Jack Zhang
Why: Previously the hw fence was allocated separately from the job.
It caused historical lifetime
Cc: Liu, Monk ; Koenig, Christian ;
Grodzovsky, Andrey ; Chen, JingWen
Subject: [PATCH] Revert "drm/scheduler: Avoid accessing freed bad job."
[Why]
for bailing job, this commit will delete it from pending list thus the bailing
job will never have a chance to be resubmitted even in a
s
--
Monk Liu | Cloud-GPU Core team
--
-Original Message-
From: Daniel Vetter
Sent: Wednesday, August 18, 2021 10:43 PM
To: Grodzovsky, Andrey
Cc: Daniel Vetter ; Alex Deucher ;
Chen, JingWen ; Mailing list -
t from later to earlier one and either deactive
* their HW callbacks or remove them from pending list if they already
* signaled.
Thanks
--
Monk Liu | Cloud-GPU Core team
--
-Original Message-
From: Daniel Vetter
Sent: Thur
't
impact performance since in that scenario
There was already something wrong/stuck on that ring/scheduler
Thanks
--
Monk Liu | Cloud-GPU Core team
--
-Original Message-
From: Liu, Monk
Sent: Thursday, Augu
---
Monk Liu | Cloud-GPU Core team
----------
-Original Message-
From: Grodzovsky, Andrey
Sent: Friday, August 20, 2021 10:07 PM
To: Liu, Monk ; Daniel Vetter ; Koenig,
Christian
Cc: Alex Deucher ; Chen, JingWen
; Mailing list - DRI d
u | Cloud-GPU Core team
--
-Original Message-
From: Grodzovsky, Andrey
Sent: Tuesday, August 24, 2021 11:02 PM
To: Liu, Monk ; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/sched: fix the bug of time out calculation
On 2021-08-24 10:46 a.m.,
--
-Original Message-
From: Christian König
Sent: Wednesday, August 25, 2021 2:32 PM
To: Liu, Monk ; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/sched: fix the bug of time out calculation(v2)
Well NAK to that approach. First of all your bug analyses
y
Thanks
--
Monk Liu | Cloud-GPU Core team
--
-Original Message-----
From: Liu, Monk
Sent: Wednesday, August 25, 2021 7:55 PM
To: 'Christian König' ;
amd-gfx@lists.freedesktop.org
Subject: RE: [PATCH] drm/sched: fix the bug of time out calculatio
Liu | Cloud-GPU Core team
--
-Original Message-
From: Christian König
Sent: Wednesday, August 25, 2021 8:11 PM
To: Liu, Monk ; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/sched: fix the bug of time out calculation(v2)
No, th
unning there, the whole counting is repeated from zero and is not accurate at all
Thanks
--
Monk Liu | Cloud-GPU Core team
--
-Original Message-
From: Grodzovsky, Andrey
Sent: Thursday, August 26, 2021 2:20 A
---
Monk Liu | Cloud-GPU Core team
--
-Original Message-
From: Grodzovsky, Andrey
Sent: Thursday, August 26, 2021 11:05 AM
To: Liu, Monk ; Christian König
; amd-gfx@lists.freedesktop.org; dri-devel
Subject: Re: [PATCH] drm/sched: fix the bug of time out calculation(v
--
-Original Message-
From: Christian König
Sent: Thursday, August 26, 2021 6:09 PM
To: Liu, Monk ; amd-gfx@lists.freedesktop.org
Cc: dri-de...@lists.freedesktop.org
Subject: Re: [PATCH] drm/sched: fix the bug of time out calculation(v3)
Am 26.08.21 um 06:5
Cloud-GPU Core team
--
-Original Message-
From: Grodzovsky, Andrey
Sent: Friday, August 27, 2021 4:14 AM
To: Liu, Monk ; amd-gfx@lists.freedesktop.org; Koenig,
Christian
Cc: dri-de...@lists.freedesktop.org
Subject: Re: [PATCH] drm/sched: fix the bug of tim
riginal Message-
From: Christian König
Sent: Friday, August 27, 2021 2:12 PM
To: Grodzovsky, Andrey ; Liu, Monk
; amd-gfx@lists.freedesktop.org; Koenig, Christian
Cc: dri-de...@lists.freedesktop.org
Subject: Re: [PATCH] drm/sched: fix the bug of time out calculation(v3)
I don't thi
_DELAYED_WORK(&sched->work_tdr, drm_sched_job_timedout);
atomic_set(&sched->_score, 0);
atomic64_set(&sched->job_id_count, 0);
Thanks
--
Monk Liu | Cloud-GPU Core team
------
-Original Message-
Sent: Tuesday, August 31, 2021 6:36 PM
To: amd-gfx@lists.freedesktop.org
Cc: dri-de...@lists.freedesktop.org; Liu, Monk
Subject: [PATCH 1/2] drm/sched: fix the bug of time out calculation(v3)
issue:
in cleanup_job the cancel_delayed_work will cancel a TO timer even though its
corresponding job is still
ing there, they should exclusively access those common members like
sched_job. (because the spin_lock is released before running into the vendor's callback)
Hope I explained ourselves well enough.
Thanks
--
Monk Liu | Cloud-GPU Core team
---------
o the
warning issue I hit.
Thanks
--
Monk Liu | Cloud-GPU Core team
--
-Original Message-
From: Daniel Vetter
Sent: Tuesday, August 31, 2021 9:02 PM
To: Liu, Monk
Cc: amd-gfx@lists.freedesktop
[AMD Official Use Only]
Hi Daniel/Christian/Andrey
It looks to me like the feedback from you three is spread across a flood of emails. The
feature we are working on (diagnostic TDR scheme) has been pending for more
than 6 months (we started it in Feb 2021).
Honestly speaking the email ways that we ar
[AMD Official Use Only]
Okay, I will re-prepare this patch
Thanks
--
Monk Liu | Cloud-GPU Core team
--
-Original Message-
From: Daniel Vetter
Sent: Tuesday, August 31, 2021 9:02 PM
To: Liu, Monk
Cc: amd
.: X server)
Thanks
--
Monk Liu | Cloud-GPU Core team
--
-Original Message-
From: Daniel Vetter
Sent: Tuesday, August 31, 2021 9:09 PM
To: Koenig, Christian
Cc: Grodzovsky, Andrey ; Christian König
;
anup_job won't return it so sched_main won't free it in parallel ...
What do you think ?
Thanks
--
Monk Liu | Cloud-GPU Core team
----------
From: Liu, Monk
Sent: Wednesday, September 1, 2021 9:23 AM
To: Koenig,
From: amd-gfx On Behalf Of Daniel Vetter
Sent: Wednesday, September 1, 2021 4:18 PM
To: Liu, Monk
Cc: Koenig, Christian ; Grodzovsky, Andrey
; Chen, JingWen ; DRI
Development ; amd-gfx@lists.freedesktop.org
Subject: Re: [diagnostic TDR mode patches] unify our solution
opinions/suggestions in o
ange inside AMD
driver please always let us know and review.
Thanks
--
Monk Liu | Cloud-GPU Core team
----------
-Original Message-
From: amd-gfx On Behalf Of Daniel Vetter
Sent: Wednesday, September 1, 2021 4:18 PM
To: Liu, Monk
Cc: Koenig, Christ
essage-
From: Dave Airlie
Sent: Thursday, September 2, 2021 2:51 AM
To: Alex Deucher
Cc: Liu, Monk ; Daniel Vetter ; Koenig,
Christian ; Grodzovsky, Andrey
; Chen, JingWen ; DRI
Development ; amd-gfx@lists.freedesktop.org
Subject: Re: [diagnostic TDR mode patches] unify our solution
opini
-Original Message-
From: Daniel Vetter
Sent: Friday, September 3, 2021 12:11 AM
To: Koenig, Christian
Cc: Liu, Monk ; Dave Airlie ; Alex Deucher
; Grodzovsky, Andrey ; Chen,
JingWen ; DRI Development
; amd-gfx@lists.freedesktop.org
Subject: Re: [diagnostic TDR mode patches] unify our solution
opi
@lists.freedesktop.org
Cc: Grodzovsky, Andrey ; Quan, Evan
; Chen, Horace ; Tuikov, Luben
; Koenig, Christian ; Deucher,
Alexander ; Xiao, Jack ; Zhang,
Hawking ; Liu, Monk ; Xu, Feifei
; Wang, Kevin(Yang) ; Xiaojie Yuan
Subject: [PATCH] drm/amdgpu: move vram recover into sriov full access
[what
and your point here.
Thanks very much, and please understand our pain here
/Monk
-Original Message-
From: Koenig, Christian
Sent: March 26, 2021 17:06
To: Zhang, Jack (Jian) ; Grodzovsky, Andrey
; Christian König
; dri-de...@lists.freedesktop.org;
amd-gfx@lists.freedesktop.org; Liu, Monk ; Deng,
m: Koenig, Christian
Sent: Friday, March 26, 2021 10:51 PM
To: Liu, Monk ; Zhang, Jack (Jian) ;
Grodzovsky, Andrey ; Christian König
; dri-de...@lists.freedesktop.org;
amd-gfx@lists.freedesktop.org; Deng, Emily ; Rob Herring
; Tomeu Vizoso ; Steven Price
Cc: Zhang, Andy ; Jiang, Jerry (S
[AMD Official Use Only - Internal Distribution Only]
Reviewed-by: Monk Liu
Thanks
--
Monk Liu | Cloud-GPU Core team
--
-Original Message-
From: Deng, Emily
Sent: Thursday, April 1, 2021 2:04 PM
To: Liu
[AMD Official Use Only - Internal Distribution Only]
Reviewed-by: Monk Liu
Better get open source team's RB as well
Thanks
--
Monk Liu | Cloud-GPU Core team
--
-Original Message-
From: amd-gfx On Behal
; amd-gfx@lists.freedesktop.org
Cc: dan...@ffwll.ch; Liu, Monk ; Chen, Horace
Subject: Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection
for SRIOV
Am 22.12.21 um 23:14 schrieb Andrey Grodzovsky:
> Since now flr work is serialized against GPU resets there is no need
> for this.
>
t: Tuesday, January 4, 2022 6:19 PM
To: Chen, JingWen ; Christian König
; Grodzovsky, Andrey
; Deng, Emily ; Liu, Monk
; dri-de...@lists.freedesktop.org;
amd-gfx@lists.freedesktop.org; Chen, Horace ; Chen,
JingWen
Cc: dan...@ffwll.ch
Subject: Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset p
echanism
Thanks
--
Monk Liu | Cloud-GPU Core team
--
-Original Message-
From: Jingwen Chen
Sent: Wednesday, May 26, 2021 2:55 PM
To: amd-gfx@lists.freedesktop.org
Cc: Liu, Monk ; Zhao, Victor ; Chen,
[AMD Official Use Only]
It looks like this lacks enough background for people to review:
- if (adev->vcn.inst[i].ring_dec.sched.ready)
+ if (adev->vcn.inst[i].ring_dec.sched.ready ||
+ (adev->asic_type == CHIP_NAVI12 &&
+
: Liu, Monk ; Zhao, Victor ; Chen,
JingWen
Subject: [PATCHv2] drm/amd/amdgpu:save psp ring wptr to avoid attack
From: Victor Zhao
[Why]
When some tools perform a psp mailbox attack, the readback value of the register
can be a random value, which may break psp.
[How]
Use a psp wptr cache mechanism
, Alexander ; Quan,
Evan ; Koenig, Christian ; Liu,
Monk ; Zhang, Hawking
Subject: [PATCH] drm/amdgpu: reset psp ring wptr during ring_create
[Why]
psp ring wptr is not initialized properly in ring_create, which would lead to
psp failure after several gpu resets.
[How]
Set ring_wptr to zero in
[AMD Official Use Only]
Reviewed-by: Monk Liu
Thanks
--
Monk Liu | Cloud-GPU Core team
--
-Original Message-
From: Yifan Zha
Sent: Friday, June 11, 2021 6:49 PM
To: amd-gfx@lists.freedesktop.org
Cc: Liu
-
-Original Message-
From: YuBiao Wang
Sent: Tuesday, June 29, 2021 6:01 PM
To: amd-gfx@lists.freedesktop.org
Cc: Grodzovsky, Andrey ; Quan, Evan
; Chen, Horace ; Tuikov, Luben
; Koenig, Christian ; Deucher,
Alexander ; Xiao, Jack ; Zhang,
Hawking ; Liu, Monk ; Xu, Feifei
; Wang
: Grodzovsky, Andrey ; Quan, Evan
; Chen, Horace ; Tuikov, Luben
; Koenig, Christian ; Deucher,
Alexander ; Xiao, Jack ; Zhang,
Hawking ; Liu, Monk ; Xu, Feifei
; Wang, Kevin(Yang) ; Wang, YuBiao
Subject: [PATCH 1/1] drm/amdgpu: Read clock counter via MMIO to reduce delay
(v4)
[Why]
GPU
; Quan,
Evan ; Koenig, Christian ; Liu,
Monk ; Zhang, Hawking
Subject: Re: [PATCH 1/1] drm/amdgpu: Read clock counter via MMIO to reduce
delay (v4)
Am 30.06.21 um 12:10 schrieb YuBiao Wang:
> [Why]
> GPU timing counters are read via KIQ under sriov, which will introduce
> a delay.
>
>
: Liu, Monk ; Chen, Horace
Subject: Re: [PATCH] drm/amdgpu: SRIOV flr_work should take write_lock
ping..
On Thu Jul 01, 2021 at 10:22:57AM +0800, Jingwen Chen wrote:
> [Why]
> If flr_work takes read_lock, then other threads that take read_lock
> can access hardware when host is doi
[AMD Official Use Only]
Reviewed-by: Monk Liu
Thanks
--
Monk Liu | Cloud-GPU Core team
--
-Original Message-
From: amd-gfx On Behalf Of Deng, Emily
Sent: Wednesday, July 7, 2021 3:42 PM
To: Deng, Emily ;
[AMD Official Use Only]
Reviewed-by: Monk Liu
You might need @Liu, Leo's review as well
Thanks
--
Monk Liu | Cloud-GPU Core team
--
-Original Message-
From: amd-gfx On Behalf Of Alex Deucher
Sent: Wedne
p.org
Cc: Chen, Horace ; Liu, Monk
Subject: Re: [PATCH] drm/amd/amdgpu: vm entities should have kernel priority
Am 19.07.21 um 07:57 schrieb Jingwen Chen:
> [Why]
> Current vm_pte entities have NORMAL priority, in SRIOV multi-vf use
> case, the vf flr happens first and then job tim
ty ?
Thanks
--
Monk Liu | Cloud-GPU Core team
--
-Original Message-----
From: Liu, Monk
Sent: Monday, July 19, 2021 5:40 PM
To: 'Christian König' ; Chen, JingWen
; amd-gfx@lists.freedesktop.org
Cc: Chen, Horace
Su
riginal Message-
From: Jingwen Chen
Sent: Friday, July 23, 2021 12:42 PM
To: amd-gfx@lists.freedesktop.org
Cc: Liu, Monk ; Grodzovsky, Andrey
; ckoenig.leichtzumer...@gmail.com; Zhang, Jack
(Jian) ; Chen, JingWen
Subject: [PATCH 1/2] drm/amd/amdgpu embed hw_fence into amdgpu_job
From: Jac
-GPU Core team
--
-Original Message-
From: Jingwen Chen
Sent: Friday, July 23, 2021 3:07 PM
To: Christian König ; Grodzovsky, Andrey
; amd-gfx@lists.freedesktop.org
Cc: Chen, Horace ; Liu, Monk ; Zhang,
Jack (Jian)
Subject: Re: [PATCH 2/2] drm: add
, Felix
Sent: Friday, July 31, 2020 10:20 AM
To: Liu, Monk ; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: introduce a new parameter to configure how
many KCQ we want(v3)
Am 2020-07-30 um 10:11 p.m. schrieb Liu, Monk:
> [AMD Official Use Only - Internal Distribution O
From: Kuehling, Felix
Sent: Friday, July 31, 2020 12:54 PM
To: Liu, Monk ; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: introduce a new parameter to configure how
many KCQ we want(v4)
Am 2020-07-30 um 11:42 p.m. schrieb Monk Liu:
> what:
> the MQD's save and restor
[AMD Official Use Only - Internal Distribution Only]
Please check V5 and see if it looks better
I use automatic indentation on those lines
_
Monk Liu|GPU Virtualization Team |AMD
-Original Message-
From: amd-gfx On Behalf Of Liu, Monk
Sent: Friday
rtualization Team |AMD
-Original Message-
From: Kuehling, Felix
Sent: Friday, July 31, 2020 9:57 PM
To: Liu, Monk ; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH 1/2] drm/amdgpu: fix reload KMD hang on KIQ
In gfx_v10_0_sw_fini the KIQ ring gets freed. Wouldn't that be the right place
[AMD Official Use Only - Internal Distribution Only]
Ping ... this is a severe bug fix
_
Monk Liu|GPU Virtualization Team |AMD
-Original Message-
From: amd-gfx On Behalf Of Liu, Monk
Sent: Monday, August 3, 2020 9:55 AM
To: Kuehling, Felix ; amd-gfx
[AMD Official Use Only - Internal Distribution Only]
--- a/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c
+++ b/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c
@@ -238,19 +238,12 @@ static void xgpu_ai_mailbox_flr_work(struct work_struct
*work)
struct amdgpu_virt *virt = container_of(work, struct amdgpu_virt, flr_
to occupy GPU by 1) taking the
reset_sem first to prevent the guest side GPU recovery routine from occupying the GPU; 2) marking
the GPU as under reset by setting "in_gpu_reset" to true
Thanks
_
Monk Liu|GPU Virtualization Team |AMD
-Original Message-
From: Li, Dennis
[AMD Official Use Only - Internal Distribution Only]
Reviewed-by: Monk Liu
_
Monk Liu|GPU Virtualization Team |AMD
-Original Message-
From: Dennis Li
Sent: Friday, August 21, 2020 11:48 AM
To: amd-gfx@lists.freedesktop.org; Deucher, Alexander
[AMD Official Use Only - Internal Distribution Only]
Reviewed-by: Monk Liu
_
Monk Liu|GPU Virtualization Team |AMD
-Original Message-
From: amd-gfx On Behalf Of Bokun Zhang
Sent: Wednesday, August 5, 2020 11:32 PM
To: amd-gfx@lists.freedesktop.org
[AMD Official Use Only - Internal Distribution Only]
Reviewed-by: Monk Liu
_
Monk Liu|GPU Virtualization Team |AMD
-Original Message-
From: amd-gfx On Behalf Of Bokun Zhang
Sent: Saturday, May 2, 2020 1:48 AM
To: amd-gfx@lists.freedesktop.org
Cc
necessary
_
Monk Liu|GPU Virtualization Team |AMD
-Original Message-
From: Das, Nirmoy
Sent: Thursday, August 27, 2020 11:19 PM
To: amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander ; Liu, Monk
; Gu, JiaWei (Will) ; Deng, Emily
; Das, Nirmoy
Subject: [PATCH 1/1] drm/amdgpu:
[AMD Official Use Only - Internal Distribution Only]
>> The patch was to fix the previous inconsistent optimization patch.
If so, that's a good reason. I'll take a look, thanks
-Original Message-
From: Das, Nirmoy
Sent: August 28, 2020 15:46
To: Liu, Monk ; Das, Nirmoy ;
amd-gfx@lists
[AMD Official Use Only - Internal Distribution Only]
See that we already have such logic:
static void amdgpu_vm_bo_relocated(struct amdgpu_vm_bo_base *vm_bo)
{
	if (vm_bo->bo->parent)
		list_move(&vm_bo->vm_status, &vm_bo->vm->relocated);
	else
		amd
[AMD Official Use Only - Internal Distribution Only]
Please split the patch into 3~4 patches, each with only one purpose
Please follow the open source kernel rules when you prepare a patch: don't
squash two (or more) things together
Thanks
_
Monk Liu|GPU
[AMD Official Use Only - Internal Distribution Only]
Those three patches are
Reviewed-by: Monk Liu
_
Monk Liu|GPU Virtualization Team |AMD
-Original Message-
From: amd-gfx On Behalf Of Bokun Zhang
Sent: Wednesday, September 16, 2020 10:57 PM
To
[AMD Official Use Only - Internal Distribution Only]
Hi Emily
There is already an amdgpu_mcbp parameter there; can you try to leverage that
one?
e.g.:
we refactor our driver's code and reduce the checking logic from "if
(amdgpu_mcbp || amdgpu_sriov_vf(adev))" to something like "if( amdgpu_mc
[AMD Official Use Only - Internal Distribution Only]
It looks like you missed many places, e.g.:
if (amdgpu_mcbp || amdgpu_sriov_vf(adev)) {
	bo_va = fpriv->csa_va;
	BUG_ON(!bo_va);
	r = amdgpu_vm_bo_update(adev, bo_va, false);
	if (r)
Yeah, let's have a deep discussion regarding RLCG logic
-Original Message-
From: Zhang, Hawking
Sent: September 22, 2020 10:04
To: Zhang, Hawking ; Khaire, Rohit
; amd-gfx@lists.freedesktop.org; Liu, Monk
Cc: Xiao, Jack ; Xu, Feifei ; Wang,
Kevin(Yang) ; Li, Rong (Zero) ; Min,
Frank ; Yuan, Xi
L1 blocked registers are accessed through RLCG path
)
-Original Message-
From: amd-gfx On Behalf Of Liu, Monk
Sent: September 22, 2020 10:54
To: Zhang, Hawking ; Khaire, Rohit
; amd-gfx@lists.freedesktop.org
Cc: Xiao, Jack ; Xu, Feifei ; Wang,
Kevin(Yang) ; Li, Rong (Zero) ; Min,
Frank ; Yuan, Xiaojie
Subject: RE
AM
To: Zhang, Hawking ; amd-gfx@lists.freedesktop.org; Liu,
Monk ; Chen, JingWen ; Deucher,
Alexander ; Deng, Emily
Subject: RE: [PATCH] amdgpu/sriov Stop data exchange for wholegpu reset
[AMD Official Use Only - Internal Distribution Only]
Ping...
-Original Message-
From: Zhang, Jack
[AMD Official Use Only - Internal Distribution Only]
Reviewed-by: Monk Liu
Thanks
--
Monk Liu | Cloud-GPU Core team
--
-Original Message-
From: amd-gfx On Behalf Of Jingwen Chen
Sent: Monday, January 18,
: amd-gfx@lists.freedesktop.org; Liu, Monk
Cc: Chen, JingWen
Subject: RE: [PATCH] drm/amd/amdgpu: add error handling to
amdgpu_virt_read_pf2vf_data
[AMD Official Use Only - Approved for External Use]
Ping
Best Regards,
JingWen Chen
-Original Message-
From: Jingwen Chen
Sent: Tuesday
: Jingwen Chen
Sent: Thursday, February 25, 2021 1:27 PM
To: amd-gfx@lists.freedesktop.org
Cc: Liu, Monk ; Koenig, Christian ;
Chen, JingWen
Subject: [PATCH 1/2] drm: add a flag to indicate job is resubmitted
Add a flag in drm_sched_job to indicate the job resubmit.
Signed-off-by: Jingwen Chen
: Thursday, February 25, 2021 4:08 PM
To: Chen, JingWen ; amd-gfx@lists.freedesktop.org
Cc: Liu, Monk
Subject: Re: [PATCH 2/2] drm/amd/amdgpu: force flush resubmit job
Good catch, but the approach for the fix is incorrect.
The device reset count can only be incremented after taking the reset lock and
[AMD Public Use]
Hi all
The NAVI2X project has hit a really hard-to-solve issue, and it turned out to
be a general headache of our TDR mechanism. Check the scenario below:
1. There is a job1 running on compute1 ring at timestamp
2. There is a job2 running on gfx ring at timestamp
3. Job1
[AMD Official Use Only - Internal Distribution Only]
See in line
Thanks
--
Monk Liu | Cloud-GPU Core team
--
From: Koenig, Christian
Sent: Friday, February 26, 2021 3:58 PM
To: Liu, Monk ; amd-gfx
job_list_lock);

job->sched->ops->timedout_job(job);
Thanks
--
Monk Liu | Cloud-GPU Core team
------
From: Liu, Monk
Sent: Friday, February 26, 2021 7:54 PM
To: Koenig, Christian ; amd-gfx@lists.f
er job timeout
1 – legacy TDR behavior
2 – enhanced TDR behavior (the one suggested here)
From: Koenig, Christian
Sent: February 26, 2021 20:05
To: Liu, Monk ; amd-gfx@lists.freedesktop.org
Cc: Zhang, Andy ; Chen, Horace ;
Zhang, Jack (Jian)
Subject: Re: [RFC] a new approach to detect which ring is the real
/process is
really harmed
Hope above example helps
Thanks
From: Grodzovsky, Andrey
Sent: February 27, 2021 0:50
To: Liu, Monk ; Koenig, Christian
; amd-gfx@lists.freedesktop.org
Cc: Zhang, Andy ; Chen, Horace ;
Zhang, Jack (Jian)
Subject: Re: [RFC] a new approach to detect which ring is the real black
[AMD Official Use Only - Internal Distribution Only]
Fix typo:
But in fact this job2 is innocent, and we should insert it back after recovery;
because it was already deleted, this innocent job’s context/process is
really harmed
From: Liu, Monk
Sent: February 27, 2021 11:56
To: Grodzovsky, Andrey
list
But they are all abandoned in drm_sched_resubmit_jobs() because they are all
processed by drm_sched_increase_karma()
From: Grodzovsky, Andrey
Sent: February 28, 2021 8:55
To: Liu, Monk ; Koenig, Christian
; amd-gfx@lists.freedesktop.org
Cc: Zhang, Andy ; Chen, Horace ;
Zhang, Jack (Jian)
Subject: Re:
r,
Alexander ; Xiao, Jack ; Zhang,
Hawking ; Liu, Monk ; Xu, Feifei
; Wang, Kevin(Yang) ; Xiaojie Yuan
Subject: [PATCH] drm/amdgpu: enable one vf mode on sienna cichlid vf
sienna cichlid needs one vf mode which allows vf to set and get clock status
from guest vm. So now expose the required interf
@lists.freedesktop.org
Cc: Grodzovsky, Andrey ; Quan, Evan
; Chen, Horace ; Tuikov, Luben
; Koenig, Christian ; Deucher,
Alexander ; Xiao, Jack ; Zhang,
Hawking ; Liu, Monk ; Xu, Feifei
; Wang, Kevin(Yang) ; Xiaojie Yuan
Subject: [PATCH] drm/amdgpu: enable one vf mode on sienna cichlid vf
sienna
-GPU Core team
--
-Original Message-
From: Koenig, Christian
Sent: Sunday, March 7, 2021 3:03 AM
To: Zhang, Jack (Jian) ; amd-gfx@lists.freedesktop.org;
Grodzovsky, Andrey ; Liu, Monk ;
Deng, Emily
Subject: Re: [PATCH v2] drm/amd/amdgpu implement
at do you think ?
Thanks
--
Monk Liu | Cloud-GPU Core team
--
-Original Message-
From: Koenig, Christian
Sent: Monday, March 8, 2021 3:53 PM
To: Liu, Monk ; Zhang, Jack (Jian) ;
amd-gfx@lists.freedeskto
Reviewed-by: Monk Liu
_
Monk Liu|GPU Virtualization Team |AMD
-Original Message-
From: amd-gfx On Behalf Of Gavin Wan
Sent: Friday, May 22, 2020 3:53 AM
To: amd-gfx@lists.freedesktop.org
Cc: Wan, Gavin
Subject: [PATCH] drm/amd/amdgpu: Fix the CGCG
Sounds like a good idea
@Wan, Gavin can you try Hawking's advice?
_
Monk Liu|GPU Virtualization Team |AMD
-Original Message-
From: amd-gfx On Behalf Of Zhang,
Hawking
Sent: Friday, May 22, 2020 1:09 PM
To: Chen, Guchun ; Wan, Gavin ;
amd-gfx@lists.free
Gavin
It looks like the only place you need to change is the part that avoids touching
"CP_INT_CNTL_RING0", which is handled by GIM now
The other changes do not look needed at all
_
Monk Liu|GPU Virtualization Team |AMD
-Original Message-
From: amd-gfx On Beh
This one looks better
You can put my RB
Thanks
_
Monk Liu|GPU Virtualization Team |AMD
-Original Message-
From: amd-gfx On Behalf Of Wan, Gavin
Sent: Saturday, May 23, 2020 3:41 AM
To: Alex Deucher
Cc: amd-gfx list
Subject: RE: [PATCH] drm/amd/a
Acked-by: Monk Liu
_
Monk Liu|GPU Virtualization Team |AMD
-Original Message-
From: amd-gfx On Behalf Of Emily Deng
Sent: Tuesday, June 2, 2020 8:40 PM
To: amd-gfx@lists.freedesktop.org
Cc: Deng, Emily
Subject: [PATCH] drm/amdgpu/sriov: Disable pm fo
Acked-by: Monk Liu
_
Monk Liu|GPU Virtualization Team |AMD
-Original Message-
From: amd-gfx On Behalf Of Emily Deng
Sent: Thursday, June 11, 2020 10:29 AM
To: amd-gfx@lists.freedesktop.org
Cc: Deng, Emily
Subject: [PATCH] drm/amdgpu/sriov: Add clear v
Acked-by: Monk Liu
_
Monk Liu|GPU Virtualization Team |AMD
-Original Message-
From: amd-gfx On Behalf Of Emily Deng
Sent: Thursday, June 11, 2020 2:02 PM
To: amd-gfx@lists.freedesktop.org
Cc: Deng, Emily
Subject: [PATCH] drm/amdgpu/sriov: Need to cle
_
Monk Liu|GPU Virtualization Team |AMD
-----Original Message-
From: Christian König
Sent: Monday, June 29, 2020 4:18 PM
To: Liu, Monk ; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: make IB test synchronize with init for SRIOV
Am 29.06
Sounds doable as well
_
Monk Liu|GPU Virtualization Team |AMD
-Original Message-
From: Christian König
Sent: Monday, June 29, 2020 5:10 PM
To: Liu, Monk ; Koenig, Christian ;
amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: make IB
stian König
Sent: Monday, June 29, 2020 5:44 PM
To: Liu, Monk ; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: make IB test synchronize with init for
SRIOV(v2)
Am 29.06.20 um 11:35 schrieb Monk Liu:
> issue:
> originally we kickoff IB test asynchronously with driver's init
Team |AMD
-Original Message-
From: Kuehling, Felix
Sent: Tuesday, July 28, 2020 7:33 AM
To: amd-gfx@lists.freedesktop.org; Liu, Monk
Subject: Re: [PATCH] drm/amdgpu: introduce a new parameter to configure how
many KCQ we want(v2)
Am 2020-07-27 um 6:47 a.m. schrieb Monk Liu:
> w
[AMD Official Use Only - Internal Distribution Only]
I repeated the patch broadcast through git-send-email
_
Monk Liu|GPU Virtualization Team |AMD
-Original Message-
From: Koenig, Christian
Sent: Tuesday, July 28, 2020 5:04 PM
To: Liu, Monk ; amd
[AMD Official Use Only - Internal Distribution Only]
Ping
_
Monk Liu|GPU Virtualization Team |AMD
-Original Message-
From: amd-gfx On Behalf Of Liu, Monk
Sent: Tuesday, July 28, 2020 5:32 PM
To: Koenig, Christian ; amd-...@freedesktop.org
Cc
your side soon
_
Monk Liu|GPU Virtualization Team |AMD
-Original Message-
From: Kuehling, Felix
Sent: Tuesday, July 28, 2020 10:38 PM
To: Liu, Monk ; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: introduce a new parameter to configure ho
[AMD Official Use Only - Internal Distribution Only]
Reviewed-by: Monk Liu
_
Monk Liu|GPU Virtualization Team |AMD
-Original Message-
From: amd-gfx On Behalf Of Bokun Zhang
Sent: Wednesday, October 28, 2020 6:02 AM
To: amd-gfx@lists.freedesktop.org