Re: [PATCH 2/2] drm/amdgpu: Do core dump immediately when job tmo

2024-08-19 Thread Khatri, Sunil
On 8/19/2024 3:23 PM, trigger.hu...@amd.com wrote: From: Trigger Huang Do the coredump immediately after a job timeout to get a closer representation of GPU's error status. V2: This will skip printing vram_lost as the GPU reset is not happened yet (Alex) V3: Unconditionally call the core du

Re: [PATCH 2/2] drm/amdgpu: Do core dump immediately when job tmo

2024-08-20 Thread Khatri, Sunil
On 8/20/2024 7:36 PM, Alex Deucher wrote: On Tue, Aug 20, 2024 at 3:30 AM Huang, Trigger wrote: [AMD Official Use Only - AMD Internal Distribution Only] -Original Message- From: Khatri, Sunil Sent: Monday, August 19, 2024 6:31 PM To: Huang, Trigger ; amd-gfx@lists.freedesktop.org

Re: [PATCH 2/2] drm/amdgpu: Do core dump immediately when job tmo

2024-08-20 Thread Khatri, Sunil
On 8/20/2024 1:00 PM, Huang, Trigger wrote: [AMD Official Use Only - AMD Internal Distribution Only] -Original Message- From: Khatri, Sunil Sent: Monday, August 19, 2024 6:31 PM To: Huang, Trigger ; amd-gfx@lists.freedesktop.org Cc: Deucher, Alexander Subject: Re: [PATCH 2/2] drm

Re: [PATCH 2/2] drm/amdgpu: Do core dump immediately when job tmo

2024-08-20 Thread Khatri, Sunil
On 8/20/2024 9:31 PM, Alex Deucher wrote: On Tue, Aug 20, 2024 at 11:31 AM Khatri, Sunil wrote: On 8/20/2024 1:00 PM, Huang, Trigger wrote: [AMD Official Use Only - AMD Internal Distribution Only] -Original Message- From: Khatri, Sunil Sent: Monday, August 19, 2024 6:31 PM To

Re: [PATCH v4 2/2] drm/amdgpu: Do core dump immediately when job tmo

2024-08-21 Thread Khatri, Sunil
Acked-by: Sunil Khatri On 8/21/2024 2:08 PM, trigger.hu...@amd.com wrote: From: Trigger Huang Do the coredump immediately after a job timeout to get a closer representation of GPU's error status. V2: This will skip printing vram_lost as the GPU reset is not happ

Re: [PATCH] drm/amdgpu: add ring timeout information in devcoredump

2024-03-05 Thread Khatri, Sunil
On 3/5/2024 2:53 PM, Christian König wrote: > Am 01.03.24 um 13:43 schrieb Sunil Khatri: >> Add ring timeout related information in the amdgpu >> devcoredump file for debugging purposes. >> >> During the gpu recovery process the registered call >> is triggered and add the debug information in data

Re: [PATCH v2] drm/amdgpu: add ring timeout information in devcoredump

2024-03-05 Thread Khatri, Sunil
On 3/5/2024 6:40 PM, Christian König wrote: Am 05.03.24 um 12:58 schrieb Sunil Khatri: Add ring timeout related information in the amdgpu devcoredump file for debugging purposes. During the gpu recovery process the registered call is triggered and add the debug information in data file create

Re: [PATCH] drm/amdgpu: cache in more vm fault information

2024-03-06 Thread Khatri, Sunil
On 3/6/2024 3:37 PM, Christian König wrote: Am 06.03.24 um 10:04 schrieb Sunil Khatri: When an page fault interrupt is raised there is a lot more information that is useful for developers to analyse the pagefault. Well actually those information are not that interesting because they are h

Re: [PATCH] drm/amdgpu: cache in more vm fault information

2024-03-06 Thread Khatri, Sunil
On 3/6/2024 6:12 PM, Christian König wrote: Am 06.03.24 um 11:40 schrieb Khatri, Sunil: On 3/6/2024 3:37 PM, Christian König wrote: Am 06.03.24 um 10:04 schrieb Sunil Khatri: When an page fault interrupt is raised there is a lot more information that is useful for developers to analyse

Re: [PATCH] drm/amdgpu: cache in more vm fault information

2024-03-06 Thread Khatri, Sunil
On 3/6/2024 8:34 PM, Christian König wrote: Am 06.03.24 um 15:29 schrieb Alex Deucher: On Wed, Mar 6, 2024 at 8:04 AM Khatri, Sunil wrote: On 3/6/2024 6:12 PM, Christian König wrote: Am 06.03.24 um 11:40 schrieb Khatri, Sunil: On 3/6/2024 3:37 PM, Christian König wrote: Am 06.03.24 um

Re: [PATCH] drm/amdgpu: cache in more vm fault information

2024-03-06 Thread Khatri, Sunil
On 3/6/2024 9:19 PM, Alex Deucher wrote: On Wed, Mar 6, 2024 at 10:32 AM Alex Deucher wrote: On Wed, Mar 6, 2024 at 10:13 AM Khatri, Sunil wrote: On 3/6/2024 8:34 PM, Christian König wrote: Am 06.03.24 um 15:29 schrieb Alex Deucher: On Wed, Mar 6, 2024 at 8:04 AM Khatri, Sunil wrote

Re: [PATCH] drm/amdgpu: cache in more vm fault information

2024-03-06 Thread Khatri, Sunil
On 3/6/2024 9:07 PM, Christian König wrote: Am 06.03.24 um 16:13 schrieb Khatri, Sunil: On 3/6/2024 8:34 PM, Christian König wrote: Am 06.03.24 um 15:29 schrieb Alex Deucher: On Wed, Mar 6, 2024 at 8:04 AM Khatri, Sunil wrote: On 3/6/2024 6:12 PM, Christian König wrote: Am 06.03.24 um

Re: [PATCH] drm/amdgpu: cache in more vm fault information

2024-03-06 Thread Khatri, Sunil
On 3/6/2024 9:45 PM, Alex Deucher wrote: On Wed, Mar 6, 2024 at 11:06 AM Khatri, Sunil wrote: On 3/6/2024 9:07 PM, Christian König wrote: Am 06.03.24 um 16:13 schrieb Khatri, Sunil: On 3/6/2024 8:34 PM, Christian König wrote: Am 06.03.24 um 15:29 schrieb Alex Deucher: On Wed, Mar 6

Re: [PATCH] drm/amdgpu: cache in more vm fault information

2024-03-06 Thread Khatri, Sunil
On 3/6/2024 9:49 PM, Christian König wrote: Am 06.03.24 um 17:06 schrieb Khatri, Sunil: On 3/6/2024 9:07 PM, Christian König wrote: Am 06.03.24 um 16:13 schrieb Khatri, Sunil: On 3/6/2024 8:34 PM, Christian König wrote: Am 06.03.24 um 15:29 schrieb Alex Deucher: On Wed, Mar 6, 2024 at 8

Re: [PATCH] drm/amdgpu: cache in more vm fault information

2024-03-06 Thread Khatri, Sunil
On 3/6/2024 9:59 PM, Alex Deucher wrote: On Wed, Mar 6, 2024 at 11:21 AM Khatri, Sunil wrote: On 3/6/2024 9:45 PM, Alex Deucher wrote: On Wed, Mar 6, 2024 at 11:06 AM Khatri, Sunil wrote: On 3/6/2024 9:07 PM, Christian König wrote: Am 06.03.24 um 16:13 schrieb Khatri, Sunil: On 3/6

Re: [PATCH] drm/amdgpu: cache in more vm fault information

2024-03-06 Thread Khatri, Sunil
, we just need to provide faulting address, Fault status register with gpu family to decode the fault along with process information. Regards Sunil Khatri On 3/6/2024 9:56 PM, Khatri, Sunil wrote: On 3/6/2024 9:49 PM, Christian König wrote: Am 06.03.24 um 17:06 schrieb Khatri, Sunil: On 3/6

RE: [PATCH] drm/amdgpu: cache in more vm fault information

2024-03-06 Thread Khatri, Sunil
...@vger.kernel.org; Joshi, Mukul ; Paneer Selvam, Arunpravin ; Khatri, Sunil Subject: [PATCH] drm/amdgpu: cache in more vm fault information When an page fault interrupt is raised there is a lot more information that is useful for developers to analyse the pagefault. Add all such information

Re: [PATCH] drm/amdgpu: add vm fault information to devcoredump

2024-03-06 Thread Khatri, Sunil
...@vger.kernel.org; Joshi, Mukul ; Paneer Selvam, Arunpravin ; Khatri, Sunil Subject: [PATCH] drm/amdgpu: add vm fault information to devcoredump Add page fault information to the devcoredump. Output of devcoredump: AMDGPU Device Coredump version: 1 kernel: 6.7.0-amd-staging-drm-next module

Re: [PATCH] drm/amdgpu: add vm fault information to devcoredump

2024-03-07 Thread Khatri, Sunil
On 3/7/2024 1:47 PM, Christian König wrote: Am 06.03.24 um 19:19 schrieb Sunil Khatri: Add page fault information to the devcoredump. Output of devcoredump: AMDGPU Device Coredump version: 1 kernel: 6.7.0-amd-staging-drm-next module: amdgpu time: 29.725011811 process_name: soft_reco

Re: [PATCH] drm/amdgpu: add vm fault information to devcoredump

2024-03-07 Thread Khatri, Sunil
On 3/7/2024 6:10 PM, Christian König wrote: Am 07.03.24 um 09:37 schrieb Khatri, Sunil: On 3/7/2024 1:47 PM, Christian König wrote: Am 06.03.24 um 19:19 schrieb Sunil Khatri: Add page fault information to the devcoredump. Output of devcoredump: AMDGPU Device Coredump version: 1

Re: [PATCH 2/2] drm/amdgpu: add vm fault information to devcoredump

2024-03-07 Thread Khatri, Sunil
On 3/8/2024 12:44 AM, Alex Deucher wrote: On Thu, Mar 7, 2024 at 12:00 PM Sunil Khatri wrote: Add page fault information to the devcoredump. Output of devcoredump: AMDGPU Device Coredump version: 1 kernel: 6.7.0-amd-staging-drm-next module: amdgpu time: 29.725011811 process_name: s

Re: [PATCH v2 2/2] drm/amdgpu: add vm fault information to devcoredump

2024-03-08 Thread Khatri, Sunil
On 3/8/2024 2:39 PM, Christian König wrote: Am 07.03.24 um 21:50 schrieb Sunil Khatri: Add page fault information to the devcoredump. Output of devcoredump: AMDGPU Device Coredump version: 1 kernel: 6.7.0-amd-staging-drm-next module: amdgpu time: 29.725011811 process_name: soft_reco

RE: [PATCH] drm/amdgpu: add all ringbuffer information in devcoredump

2024-03-11 Thread Khatri, Sunil
-ker...@vger.kernel.org; Khatri, Sunil Subject: [PATCH] drm/amdgpu: add all ringbuffer information in devcoredump Add ringbuffer information such as: rptr, wptr, ring name, ring size and also the ring contents for each ring on a gpu reset. Signed-off-by: Sunil Khatri --- drivers/gpu/drm/amd

Re: [PATCH] drm/amdgpu: add ring buffer information in devcoredump

2024-03-11 Thread Khatri, Sunil
On 3/11/2024 7:29 PM, Christian König wrote: Am 11.03.24 um 13:22 schrieb Sunil Khatri: Add relevant ringbuffer information such as rptr, wptr, ring name, ring size and also the ring contents for each ring on a gpu reset. Signed-off-by: Sunil Khatri --- drivers/gpu/drm/amd/amdgpu/amdgpu_

Re: [PATCH 1/2] drm/amdgpu: add the IP information of the soc

2024-03-13 Thread Khatri, Sunil
md-gfx@lists.freedesktop.org ; dri-de...@lists.freedesktop.org ; linux-ker...@vger.kernel.org ; Khatri, Sunil Subject: [PATCH 1/2] drm/amdgpu: add the IP information of the soc Add all the IP's information on a SOC to the devcoredump. Signed-off-by: Sunil Khatri --- drivers/gpu/drm

Re: [PATCH 1/2] drm/amdgpu: add the IP information of the soc

2024-03-13 Thread Khatri, Sunil
md-gfx@lists.freedesktop.org ; dri-de...@lists.freedesktop.org ; linux-ker...@vger.kernel.org ; Khatri, Sunil Subject: [PATCH 1/2] drm/amdgpu: add the IP information of the soc Add all the IP's information on a SOC to the devcoredump. Signed-off-by: Sunil Khatri --- drivers/gpu/drm

Re: [PATCH 2/2] drm:amdgpu: add firmware information of all IP's

2024-03-13 Thread Khatri, Sunil
md-gfx@lists.freedesktop.org ; dri-de...@lists.freedesktop.org ; linux-ker...@vger.kernel.org ; Khatri, Sunil Subject: [PATCH 2/2] drm:amdgpu: add firmware information of all IP's Add firmware version information of each IP and each instance where applicable. Signed-off-by: Sunil Khatri --- drivers/

Re: [PATCH 1/2] drm/amdgpu: add the IP information of the soc

2024-03-13 Thread Khatri, Sunil
On 3/14/2024 1:58 AM, Alex Deucher wrote: On Tue, Mar 12, 2024 at 8:41 AM Sunil Khatri wrote: Add all the IP's information on a SOC to the devcoredump. Signed-off-by: Sunil Khatri --- drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c | 19 +++ 1 file changed, 19 insertions(+) di

Re: [PATCH 2/2] drm:amdgpu: add firmware information of all IP's

2024-03-13 Thread Khatri, Sunil
On 3/14/2024 2:06 AM, Alex Deucher wrote: On Tue, Mar 12, 2024 at 8:42 AM Sunil Khatri wrote: Add firmware version information of each IP and each instance where applicable. Is there a way we can share some common code with devcoredump, debugfs, and the info IOCTL? All three places need to

Re: [PATCH 2/2] drm:amdgpu: add firmware information of all IP's

2024-03-14 Thread Khatri, Sunil
On 3/14/2024 11:40 AM, Sharma, Shashank wrote: On 14/03/2024 06:58, Khatri, Sunil wrote: On 3/14/2024 2:06 AM, Alex Deucher wrote: On Tue, Mar 12, 2024 at 8:42 AM Sunil Khatri wrote: Add firmware version information of each IP and each instance where applicable. Is there a way we can

Re: [PATCH 1/2] drm/amdgpu: add the IP information of the soc

2024-03-14 Thread Khatri, Sunil
On 3/14/2024 8:12 PM, Alex Deucher wrote: On Thu, Mar 14, 2024 at 1:44 AM Khatri, Sunil wrote: On 3/14/2024 1:58 AM, Alex Deucher wrote: On Tue, Mar 12, 2024 at 8:41 AM Sunil Khatri wrote: Add all the IP's information on a SOC to the devcoredump. Signed-off-by: Sunil K

RE: [PATCH] drm/amdgpu: add the hw_ip version of all IP's

2024-03-15 Thread Khatri, Sunil
; linux-ker...@vger.kernel.org; Khatri, Sunil Subject: [PATCH] drm/amdgpu: add the hw_ip version of all IP's Add all the IP's version information on a SOC to the devcoredump. Signed-off-by: Sunil Khatri --- drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c | 62 +++ 1 fi

Re: [PATCH] drm/amdgpu: add the hw_ip version of all IP's

2024-03-15 Thread Khatri, Sunil
On 3/15/2024 6:45 PM, Alex Deucher wrote: On Fri, Mar 15, 2024 at 8:13 AM Sunil Khatri wrote: Add all the IP's version information on a SOC to the devcoredump. Signed-off-by: Sunil Khatri This looks great. Reviewed-by: Alex Deucher Thanks Alex --- drivers/gpu/drm/amd/amdgpu/amdgp

Re: [bug report] drm/amdgpu: add ring buffer information in devcoredump

2024-03-15 Thread Khatri, Sunil
Thanks for pointing these. I do have some doubt and i raised inline. On 3/15/2024 8:46 PM, Dan Carpenter wrote: Hello Sunil Khatri, Commit 42742cc541bb ("drm/amdgpu: add ring buffer information in devcoredump") from Mar 11, 2024 (linux-next), leads to the following Smatch static checker warning

RE: [bug report] drm/amdgpu: add ring buffer information in devcoredump

2024-03-17 Thread Khatri, Sunil
[AMD Official Use Only - General] Got it. Thanks for reported that. Sent the patch for review. Regards Sunil khatri -Original Message- From: Dan Carpenter Sent: Saturday, March 16, 2024 2:42 PM To: Khatri, Sunil Cc: Khatri, Sunil ; Koenig, Christian ; Deucher, Alexander ; amd-gfx

Re: [PATCH] drm/amdgpu: refactor code to reuse system information

2024-03-19 Thread Khatri, Sunil
Validated the code by using the function in same way as ioctl would use in devcoredump and getting the valid values. Also this would be the container of the information that we need to share between ioctl, debugfs and devcoredump and keep updating this based on information needed. On 3/19/2

Re: [PATCH] drm/amdgpu: refactor code to reuse system information

2024-03-19 Thread Khatri, Sunil
On 3/19/2024 7:19 PM, Lazar, Lijo wrote: On 3/19/2024 6:02 PM, Sunil Khatri wrote: Refactor the code so debugfs and devcoredump can reuse the common information and avoid unnecessary copy of it. created a new file which would be the right place to hold functions which will be used between sy

Re: [PATCH] drm/amdgpu: refactor code to reuse system information

2024-03-19 Thread Khatri, Sunil
On 3/19/2024 7:43 PM, Lazar, Lijo wrote: On 3/19/2024 7:27 PM, Khatri, Sunil wrote: On 3/19/2024 7:19 PM, Lazar, Lijo wrote: On 3/19/2024 6:02 PM, Sunil Khatri wrote: Refactor the code so debugfs and devcoredump can reuse the common information and avoid unnecessary copy of it. created a

Re: [PATCH v2] drm/amdgpu: refactor code to reuse system information

2024-03-19 Thread Khatri, Sunil
On 3/19/2024 8:07 PM, Christian König wrote: Am 19.03.24 um 15:25 schrieb Sunil Khatri: Refactor the code so debugfs and devcoredump can reuse the common information and avoid unnecessary copy of it. created a new file which would be the right place to hold functions which will be used betwee

Re: [PATCH] drm/amdgpu: refactor code to reuse system information

2024-03-19 Thread Khatri, Sunil
Sent a new patch based on discussion with Alex. On 3/19/2024 8:34 PM, Christian König wrote: Am 19.03.24 um 15:59 schrieb Alex Deucher: On Tue, Mar 19, 2024 at 10:56 AM Christian König wrote: Am 19.03.24 um 15:26 schrieb Alex Deucher: On Tue, Mar 19, 2024 at 8:32 AM Sunil Khatri wrote: Ref

RE: [PATCH v2] drm/amdgpu: refactor code to reuse system information

2024-03-19 Thread Khatri, Sunil
...@vger.kernel.org; Zhang, Hawking ; Kuehling, Felix ; Lazar, Lijo ; Khatri, Sunil Subject: [PATCH v2] drm/amdgpu: refactor code to reuse system information Refactor the code so debugfs and devcoredump can reuse the common information and avoid unnecessary copy of it. created a new file which

Re: [PATCH] drm/amdgpu: add support of bios dump in devcoredump

2024-03-26 Thread Khatri, Sunil
On 3/26/2024 10:23 PM, Alex Deucher wrote: On Tue, Mar 26, 2024 at 10:38 AM Sunil Khatri wrote: dump the bios binary in the devcoredump. Signed-off-by: Sunil Khatri --- .../gpu/drm/amd/amdgpu/amdgpu_dev_coredump.c | 20 +++ 1 file changed, 20 insertions(+) diff --git a/

Re: [PATCH] drm/amdgpu: add IP's FW information to devcoredump

2024-03-27 Thread Khatri, Sunil
On 3/28/2024 8:38 AM, Alex Deucher wrote: On Tue, Mar 26, 2024 at 1:31 PM Sunil Khatri wrote: Add FW information of all the IP's in the devcoredump. Signed-off-by: Sunil Khatri Might want to include the vbios version info as well, e.g., atom_context->name atom_context->vbios_pn atom_contex

RE: [PATCH 2/2] drm/amdgpu: Add support of gfx10 register dump

2024-04-12 Thread Khatri, Sunil
[AMD Official Use Only - General] Ignore sent by mistake. -Original Message- From: Sunil Khatri Sent: Friday, April 12, 2024 2:30 PM To: Deucher, Alexander ; Koenig, Christian Cc: amd-gfx@lists.freedesktop.org; Khatri, Sunil Subject: [PATCH 2/2] drm/amdgpu: Add support of gfx10

RE: [PATCH 0/2] First set in IP dump patches

2024-04-12 Thread Khatri, Sunil
[AMD Official Use Only - General] Ignore the series sent by mistake -Original Message- From: Sunil Khatri Sent: Friday, April 12, 2024 2:30 PM To: Deucher, Alexander ; Koenig, Christian Cc: amd-gfx@lists.freedesktop.org; Khatri, Sunil Subject: [PATCH 0/2] First set in IP dump

Re: [PATCH v2 2/2] drm/amdgpu: Add support of gfx10 register dump

2024-04-12 Thread Khatri, Sunil
On 4/12/2024 8:50 PM, Alex Deucher wrote: I would split this into two patches, one to add the core infrastructure in devcoredump and one to add gfx10 support. The core support could be squashed into patch 1 as well. Sure would push the v3 with the changes. Regards Sunil

Re: [PATCH v2 2/2] drm/amdgpu: Add support of gfx10 register dump

2024-04-12 Thread Khatri, Sunil
On 4/12/2024 8:50 PM, Alex Deucher wrote: On Fri, Apr 12, 2024 at 10:00 AM Sunil Khatri wrote: Adding initial set of registers for ipdump during devcoredump starting with gfx10 gc registers. ip dump is triggered when gpu reset happens via devcoredump and the memory is allocated by each ip an

Re: [PATCH v2 2/2] drm/amdgpu: Add support of gfx10 register dump

2024-04-12 Thread Khatri, Sunil
On 4/12/2024 10:42 PM, Alex Deucher wrote: On Fri, Apr 12, 2024 at 1:05 PM Khatri, Sunil wrote: On 4/12/2024 8:50 PM, Alex Deucher wrote: On Fri, Apr 12, 2024 at 10:00 AM Sunil Khatri wrote: Adding initial set of registers for ipdump during devcoredump starting with gfx10 gc registers

RE: [PATCH v2 2/2] drm/amdgpu: Add support of gfx10 register dump

2024-04-12 Thread Khatri, Sunil
[AMD Official Use Only - General] -Original Message- From: Alex Deucher Sent: Saturday, April 13, 2024 1:56 AM To: Khatri, Sunil Cc: Khatri, Sunil ; Deucher, Alexander ; Koenig, Christian ; amd-gfx@lists.freedesktop.org Subject: Re: [PATCH v2 2/2] drm/amdgpu: Add support of gfx10

RE: [PATCH v1 3/3] drm/amdgpu: select compute ME engines dynamically

2024-07-09 Thread Khatri, Sunil
[AMD Official Use Only - AMD Internal Distribution Only] Thanks Alex -Original Message- From: Alex Deucher Sent: Tuesday, July 9, 2024 7:27 PM To: Khatri, Sunil Cc: Deucher, Alexander ; Koenig, Christian ; amd-gfx@lists.freedesktop.org Subject: Re: [PATCH v1 3/3] drm/amdgpu: select

RE: [PATCH v1 0/2] SDMA v5_2 ip dump support for devcoredump

2024-07-12 Thread Khatri, Sunil
[AMD Official Use Only - AMD Internal Distribution Only] Ignore Plz -Original Message- From: Sunil Khatri Sent: Friday, July 12, 2024 5:23 PM To: Deucher, Alexander ; Koenig, Christian Cc: amd-gfx@lists.freedesktop.org; Khatri, Sunil Subject: [PATCH v1 0/2] SDMA v5_2 ip dump support

Re: [PATCH v1 2/3] drm/amdgpu: add vcn_v3_0 ip dump support

2024-07-24 Thread Khatri, Sunil
On 7/24/2024 9:33 PM, Alex Deucher wrote: On Wed, Jul 24, 2024 at 7:33 AM Sunil Khatri wrote: Add support of vcn ip dump in the devcoredump for vcn_v3_0. Signed-off-by: Sunil Khatri --- drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c | 149 +- 1 file changed, 148 insertions

Re: [PATCH v1 2/3] drm/amdgpu: add vcn_v3_0 ip dump support

2024-07-24 Thread Khatri, Sunil
On 7/25/2024 10:58 AM, Gopalakrishnan, Veerabadhran (Veera) wrote: [AMD Official Use Only - AMD Internal Distribution Only] -Original Message- From: amd-gfx On Behalf Of Khatri, Sunil Sent: Wednesday, July 24, 2024 10:00 PM To: Alex Deucher ; Khatri, Sunil Cc: Deucher, Alexander

Re: [PATCH 2/2] drm/amdgpu: trigger ip dump before suspend of IP's

2024-07-26 Thread Khatri, Sunil
On 7/26/2024 7:18 PM, Lazar, Lijo wrote: On 7/26/2024 6:42 PM, Alex Deucher wrote: On Fri, Jul 26, 2024 at 8:48 AM Sunil Khatri wrote: Problem: IP dump right now is done post suspend of all IP's which for some IP's could change power state and software state too which we do not want to refl

Re: [PATCH 2/2] drm/amdgpu: trigger ip dump before suspend of IP's

2024-07-26 Thread Khatri, Sunil
On 7/26/2024 7:53 PM, Khatri, Sunil wrote: On 7/26/2024 7:18 PM, Lazar, Lijo wrote: On 7/26/2024 6:42 PM, Alex Deucher wrote: On Fri, Jul 26, 2024 at 8:48 AM Sunil Khatri wrote: Problem: IP dump right now is done post suspend of all IP's which for some IP's could change power

Re: [PATCH 2/2] drm/amdgpu: trigger ip dump before suspend of IP's

2024-07-26 Thread Khatri, Sunil
On 7/26/2024 8:36 PM, Lazar, Lijo wrote: On 7/26/2024 8:11 PM, Khatri, Sunil wrote: On 7/26/2024 7:53 PM, Khatri, Sunil wrote: On 7/26/2024 7:18 PM, Lazar, Lijo wrote: On 7/26/2024 6:42 PM, Alex Deucher wrote: On Fri, Jul 26, 2024 at 8:48 AM Sunil Khatri wrote: Problem: IP dump right

Re: [PATCH 2/2] drm/amdgpu: trigger ip dump before suspend of IP's

2024-07-26 Thread Khatri, Sunil
On 7/27/2024 12:13 AM, Alex Deucher wrote: On Fri, Jul 26, 2024 at 1:16 PM Khatri, Sunil wrote: On 7/26/2024 8:36 PM, Lazar, Lijo wrote: On 7/26/2024 8:11 PM, Khatri, Sunil wrote: On 7/26/2024 7:53 PM, Khatri, Sunil wrote: On 7/26/2024 7:18 PM, Lazar, Lijo wrote: On 7/26/2024 6:42 PM

Re: [PATCH 2/2] drm/amdgpu: trigger ip dump before suspend of IP's

2024-07-28 Thread Khatri, Sunil
On 7/29/2024 10:08 AM, Lazar, Lijo wrote: On 7/27/2024 12:51 AM, Khatri, Sunil wrote: On 7/27/2024 12:13 AM, Alex Deucher wrote: On Fri, Jul 26, 2024 at 1:16 PM Khatri, Sunil wrote: On 7/26/2024 8:36 PM, Lazar, Lijo wrote: On 7/26/2024 8:11 PM, Khatri, Sunil wrote: On 7/26/2024 7:53 PM

Re: [PATCH 2/2] drm/amdgpu: trigger ip dump before suspend of IP's

2024-07-29 Thread Khatri, Sunil
On 7/29/2024 11:17 AM, Lazar, Lijo wrote: On 7/29/2024 11:08 AM, Khatri, Sunil wrote: On 7/29/2024 10:08 AM, Lazar, Lijo wrote: On 7/27/2024 12:51 AM, Khatri, Sunil wrote: On 7/27/2024 12:13 AM, Alex Deucher wrote: On Fri, Jul 26, 2024 at 1:16 PM Khatri, Sunil wrote: On 7/26/2024 8:36

RE: [PATCH] drm/amdgpu: add support of burst nop for gfx10

2024-07-29 Thread Khatri, Sunil
Cc: amd-gfx@lists.freedesktop.org; Khatri, Sunil Subject: [PATCH] drm/amdgpu: add support of burst nop for gfx10 Problem: Till now we are adding NOP packet one by one i.e if we need N nop packets for padding we are adding N NOP packets in the ring which does not use the HW efficiently

Re: [PATCH] drm/amdgpu: optimize the padding with hw optimization

2024-07-31 Thread Khatri, Sunil
On 8/1/2024 8:49 AM, Marek Olšák wrote: On Tue, Jul 30, 2024 at 8:43 AM Sunil Khatri wrote: Adding NOP packets one by one in the ring does not use the CP efficiently. Solution: Use CP optimization while adding NOP packet's so PFP can discard NOP packets based on information of count from the

Re: [PATCH] drm/amdgpu: optimize the padding with hw optimization

2024-07-31 Thread Khatri, Sunil
On 8/1/2024 8:52 AM, Marek Olšák wrote: On Wed, Jul 31, 2024 at 11:19 PM Marek Olšák wrote: On Tue, Jul 30, 2024 at 8:43 AM Sunil Khatri wrote: Adding NOP packets one by one in the ring does not use the CP efficiently. Solution: Use CP optimization while adding NOP packet's so PFP can disc

Re: [PATCH] drm/amdgpu: optimize the padding with hw optimization

2024-07-31 Thread Khatri, Sunil
On 8/1/2024 8:54 AM, Marek Olšák wrote: On Tue, Jul 30, 2024 at 8:43 AM Sunil Khatri wrote: Adding NOP packets one by one in the ring does not use the CP efficiently. Solution: Use CP optimization while adding NOP packet's so PFP can discard NOP packets based on information of count from the

Re: [PATCH v1 06/15] drm/amdgpu: add print support for vcn_v4_0_3 ip dump

2024-08-06 Thread Khatri, Sunil
On 8/7/2024 3:02 AM, Alex Deucher wrote: On Tue, Aug 6, 2024 at 4:18 AM Sunil Khatri wrote: Add support for logging the registers in devcoredump buffer for vcn_v4_0_3. Signed-off-by: Sunil Khatri --- drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c | 34 - 1 file changed,

Re: [PATCH v1 05/15] drm/amdgpu: add vcn_v4_0_3 ip dump support

2024-08-06 Thread Khatri, Sunil
On 8/7/2024 2:58 AM, Alex Deucher wrote: On Tue, Aug 6, 2024 at 4:18 AM Sunil Khatri wrote: Add support of vcn ip dump in the devcoredump for vcn_v4_0_3. Signed-off-by: Sunil Khatri --- drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c | 170 +++- 1 file changed, 169 insertions(+

Re: [PATCH v1 05/15] drm/amdgpu: add vcn_v4_0_3 ip dump support

2024-08-08 Thread Khatri, Sunil
On 8/8/2024 11:20 AM, Lazar, Lijo wrote: On 8/7/2024 2:58 AM, Alex Deucher wrote: On Tue, Aug 6, 2024 at 4:18 AM Sunil Khatri wrote: Add support of vcn ip dump in the devcoredump for vcn_v4_0_3. Signed-off-by: Sunil Khatri --- drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c | 170 ++

Re: [PATCH v1 05/15] drm/amdgpu: add vcn_v4_0_3 ip dump support

2024-08-08 Thread Khatri, Sunil
On 8/8/2024 12:44 PM, Lazar, Lijo wrote: On 8/8/2024 12:36 PM, Khatri, Sunil wrote: On 8/8/2024 11:20 AM, Lazar, Lijo wrote: On 8/7/2024 2:58 AM, Alex Deucher wrote: On Tue, Aug 6, 2024 at 4:18 AM Sunil Khatri wrote: Add support of vcn ip dump in the devcoredump for vcn_v4_0_3. Signed

RE: [PATCH v2 1/4] drm/amdgpu: add gfx9_4_3 register support in ipdump

2024-08-13 Thread Khatri, Sunil
[AMD Official Use Only - AMD Internal Distribution Only] -Original Message- From: amd-gfx On Behalf Of Alex Deucher Sent: Tuesday, August 13, 2024 9:40 PM To: Khatri, Sunil Cc: Deucher, Alexander ; Koenig, Christian ; amd-gfx@lists.freedesktop.org Subject: Re: [PATCH v2 1/4] drm/amdgpu

Re: [PATCH v3 4/5] drm/amdgpu: enable redirection of irq's for IH V6.0

2024-04-16 Thread Khatri, Sunil
On 4/16/2024 7:56 PM, Alex Deucher wrote: On Tue, Apr 16, 2024 at 9:34 AM Sunil Khatri wrote: Enable redirection of irq for pagefaults for specific clients to avoid overflow without dropping interrupts. So here we redirect the interrupts to another IH ring i.e ring1 where only these interrup

Re: [PATCH 2/6] drm/amdgpu: add support of gfx10 register dump

2024-04-16 Thread Khatri, Sunil
On 4/16/2024 7:25 PM, Alex Deucher wrote: On Tue, Apr 16, 2024 at 8:08 AM Sunil Khatri wrote: Adding gfx10 gc registers to be used for register dump via devcoredump during a gpu reset. Signed-off-by: Sunil Khatri --- drivers/gpu/drm/amd/amdgpu/amdgpu.h | 12 ++ drivers/gpu/drm/a

Re: [PATCH 2/6] drm/amdgpu: add support of gfx10 register dump

2024-04-16 Thread Khatri, Sunil
On 4/16/2024 7:30 PM, Christian König wrote: Am 16.04.24 um 15:55 schrieb Alex Deucher: On Tue, Apr 16, 2024 at 8:08 AM Sunil Khatri wrote: Adding gfx10 gc registers to be used for register dump via devcoredump during a gpu reset. Signed-off-by: Sunil Khatri --- drivers/gpu/drm/amd/amdgpu

Re: [PATCH 4/6] drm/amdgpu: add support for gfx v10 print

2024-04-16 Thread Khatri, Sunil
On 4/16/2024 7:27 PM, Alex Deucher wrote: On Tue, Apr 16, 2024 at 8:08 AM Sunil Khatri wrote: Add support to print ip information to be used to print registers in devcoredump buffer. Signed-off-by: Sunil Khatri --- drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 17 - 1 file chang

Re: [PATCH 6/6] drm/amdgpu: add ip dump for each ip in devcoredump

2024-04-16 Thread Khatri, Sunil
On 4/16/2024 7:29 PM, Alex Deucher wrote: On Tue, Apr 16, 2024 at 8:08 AM Sunil Khatri wrote: Add ip dump for each ip of the asic in the devcoredump for all the ips where a callback is registered for register dump. Signed-off-by: Sunil Khatri --- drivers/gpu/drm/amd/amdgpu/amdgpu_dev_core

Re: [PATCH v2] drm/amdgpu: Skip the coredump collection on reset during driver reload

2024-04-17 Thread Khatri, Sunil
devcoredump is used to debug gpu hangs/resets. So in normal process when there is a hang due to ring timeout or page fault we are doing a hard reset as soft reset fail in those cases. How are we making sure that the devcoredump is triggered in those cases and captured? Regards Sunil Khatri On

Re: [PATCH v2] drm/amdgpu: Skip the coredump collection on reset during driver reload

2024-04-17 Thread Khatri, Sunil
On 4/17/2024 1:06 PM, Khatri, Sunil wrote: devcoredump is used to debug gpu hangs/resets. So in normal process when there is a hang due to ring timeout or page fault we are doing a hard reset as soft reset fail in those cases. How are we making sure that the devcoredump is triggered in those

Re: [PATCH v2] drm/amdgpu: Skip the coredump collection on reset during driver reload

2024-04-17 Thread Khatri, Sunil
On 4/17/2024 1:19 PM, Lazar, Lijo wrote: On 4/17/2024 1:14 PM, Khatri, Sunil wrote: On 4/17/2024 1:06 PM, Khatri, Sunil wrote: devcoredump is used to debug gpu hangs/resets. So in normal process when there is a hang due to ring timeout or page fault we are doing a hard reset as soft reset

Re: [PATCH v4 2/6] drm/amdgpu: add support of gfx10 register dump

2024-04-17 Thread Khatri, Sunil
On 4/17/2024 2:15 PM, Christian König wrote: Am 17.04.24 um 10:18 schrieb Sunil Khatri: Adding gfx10 gc registers to be used for register dump via devcoredump during a gpu reset. Signed-off-by: Sunil Khatri --- drivers/gpu/drm/amd/amdgpu/amdgpu.h | 8 ++ drivers/gpu/drm/amd/

Re: [PATCH v5 2/6] drm/amdgpu: add support of gfx10 register dump

2024-04-17 Thread Khatri, Sunil
On 4/17/2024 9:31 PM, Lazar, Lijo wrote: On 4/17/2024 9:21 PM, Alex Deucher wrote: On Wed, Apr 17, 2024 at 5:38 AM Sunil Khatri wrote: Adding gfx10 gc registers to be used for register dump via devcoredump during a gpu reset. Signed-off-by: Sunil Khatri Reviewed-by: Alex Deucher ---

Re: [PATCH v5 2/6] drm/amdgpu: add support of gfx10 register dump

2024-04-17 Thread Khatri, Sunil
adev->gfx.ip_dump[i] = RREG32(SOC15_REG_ENTRY_OFFSET(gc_reg_list_10_1[i])); } amdgpu_gfx_off_ctrl(adev, true); Sunil Alex Thanks, Lijo -Original Message- From: Khatri, Sunil Sent: Wednesday, April 17, 2024 9:42 PM To: Lazar, Lijo ; Alex Deucher ; Khatri, Sunil C

Re: [PATCH] drm/amdgpu: skip ip dump if devcoredump flag is set

2024-04-25 Thread Khatri, Sunil
On 4/25/2024 7:43 PM, Lazar, Lijo wrote: On 4/25/2024 3:53 PM, Sunil Khatri wrote: Do not dump the ip registers during driver reload in passthrough environment. Signed-off-by: Sunil Khatri --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 10 ++ 1 file changed, 6 insertions(+), 4 de

Re: [PATCH v1 3/4] drm/amdgpu: add compute registers in ip dump for gfx10

2024-05-03 Thread Khatri, Sunil
On 5/3/2024 8:52 PM, Alex Deucher wrote: On Fri, May 3, 2024 at 4:45 AM Sunil Khatri wrote: add compute registers in set of registers to dump during ip dump for gfx10. Signed-off-by: Sunil Khatri --- drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 42 +- 1 file changed,

Re: [PATCH v1 3/4] drm/amdgpu: add compute registers in ip dump for gfx10

2024-05-03 Thread Khatri, Sunil
On 5/3/2024 9:18 PM, Khatri, Sunil wrote: On 5/3/2024 8:52 PM, Alex Deucher wrote: On Fri, May 3, 2024 at 4:45 AM Sunil Khatri wrote: add compute registers in set of registers to dump during ip dump for gfx10. Signed-off-by: Sunil Khatri --- drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 42

Re: [PATCH v1 3/4] drm/amdgpu: add compute registers in ip dump for gfx10

2024-05-03 Thread Khatri, Sunil
On 5/3/2024 9:52 PM, Alex Deucher wrote: On Fri, May 3, 2024 at 12:09 PM Khatri, Sunil wrote: On 5/3/2024 9:18 PM, Khatri, Sunil wrote: On 5/3/2024 8:52 PM, Alex Deucher wrote: On Fri, May 3, 2024 at 4:45 AM Sunil Khatri wrote: add compute registers in set of registers to dump during ip

Re: [PATCH v3 1/4] drm/amdgpu: update the ip_dump to ipdump_core

2024-05-15 Thread Khatri, Sunil
On 5/16/2024 1:37 AM, Deucher, Alexander wrote: [Public] -Original Message- From: Sunil Khatri Sent: Wednesday, May 15, 2024 8:18 AM To: Deucher, Alexander ; Koenig, Christian Cc: amd-gfx@lists.freedesktop.org; Khatri, Sunil Subject: [PATCH v3 1/4] drm/amdgpu: update the ip_dump

Re: [PATCH v3 3/4] drm/amdgpu: add support to dump gfx10 queue registers

2024-05-15 Thread Khatri, Sunil
On 5/16/2024 1:42 AM, Deucher, Alexander wrote: [Public] -Original Message- From: Sunil Khatri Sent: Wednesday, May 15, 2024 8:18 AM To: Deucher, Alexander ; Koenig, Christian Cc: amd-gfx@lists.freedesktop.org; Khatri, Sunil Subject: [PATCH v3 3/4] drm/amdgpu: add support to dump

Re: [PATCH v3 2/4] drm/amdgpu: Add support to dump gfx10 cp registers

2024-05-15 Thread Khatri, Sunil
On 5/16/2024 1:40 AM, Deucher, Alexander wrote: [Public] -Original Message- From: Sunil Khatri Sent: Wednesday, May 15, 2024 8:18 AM To: Deucher, Alexander ; Koenig, Christian Cc: amd-gfx@lists.freedesktop.org; Khatri, Sunil Subject: [PATCH v3 2/4] drm/amdgpu: Add support to dump

RE: [PATCH v1 1/3] drm/amdgpu: add gfx9 register support in ipdump

2024-05-29 Thread Khatri, Sunil
[AMD Official Use Only - AMD Internal Distribution Only] -Original Message- From: Alex Deucher Sent: Wednesday, May 29, 2024 7:16 PM To: Khatri, Sunil Cc: Deucher, Alexander ; Koenig, Christian ; amd-gfx@lists.freedesktop.org Subject: Re: [PATCH v1 1/3] drm/amdgpu: add gfx9 register

RE: [PATCH] drm/amdgpu/gmc11: implement get_vbios_fb_size()

2023-05-12 Thread Khatri, Sunil
[AMD Official Use Only - General] Acked-by: Sunil Khatri -Original Message- From: amd-gfx On Behalf Of Alex Deucher Sent: Thursday, May 11, 2023 8:13 PM To: amd-gfx@lists.freedesktop.org Cc: Deucher, Alexander Subject: [PATCH] drm/amdgpu/gmc11: implement get_vbios_fb_size() Implement

RE: Help debug amdgpu faults

2022-11-22 Thread Khatri, Sunil
[AMD Official Use Only - General] Hello Alex, Robert I too have similar issues which I am facing on chrome. Are there any tools in linux environment which can help debug such issues like page faults, kernel panic caused by invalid pointer access. I have used tools like ramdump parser which can

RE: [PATCH] drm/amdgpu: enable tmz by default for skyrim

2022-05-30 Thread Khatri, Sunil
[AMD Official Use Only - General] @Ernst Sjöstrand Makes sense. Thanks for the review. Pushed another patch without any such names. Regards Sunil Khatri From: Ernst Sjöstrand Sent: Tuesday, May 31, 2022 1:47 AM To: Khatri, Sunil Cc: Deucher, Alexander ; amd-gfx m

Re: [PATCH 1/2] drm/amdgpu: drop volatile from ring buffer

2024-10-08 Thread Khatri, Sunil
Reviewed-by: Sunil Khatri On 10/8/2024 11:41 PM, Christian König wrote: Volatile only prevents the compiler from re-ordering reads and writes. Since we always only modify the ring buffer from one CPU thread and have an explicit barrier before signaling the HW this should have no effect at all a

Re: [PATCH 1/2] drm/amdgpu: validate if sw_init is defined or NULL

2024-10-09 Thread Khatri, Sunil
On 10/9/2024 4:19 PM, Christian König wrote: Am 09.10.24 um 10:48 schrieb Sunil Khatri: Before making a function call to sw_init, validate the function pointer. Maybe add " like we do for hw_init." or some similar example of optional callback. Sure will update the commit message also S

Re: [PATCH 2/2] drm/admgpu: clean the dummy sw_init functions

2024-10-09 Thread Khatri, Sunil
On 10/9/2024 4:21 PM, Christian König wrote: Am 09.10.24 um 10:48 schrieb Sunil Khatri: Remove the dummy sw_init functions and set the corresponding functions to NULL. Signed-off-by: Sunil Khatri ---   drivers/gpu/drm/amd/amdgpu/amdgpu_isp.c | 5 -   drivers/gpu/drm/amd/amdgpu/cik.c  

Re: [PATCH 2/2] drm/amdgpu: stop masking the wptr all the time

2024-10-08 Thread Khatri, Sunil
On 10/8/2024 11:41 PM, Christian König wrote: Stop masking the wptr and decrementing the count_dw while writing into the ring buffer. We can do that all at once while pushing the changes to the HW. Signed-off-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 11 +--

Re: [PATCH v2 1/5] drm/amdgpu: update the handle ptr in early_init

2024-09-30 Thread Khatri, Sunil
Sure Christian. Changes related to dummy functions i was planning to remove in a separate patch as there are many empty functions. Also related to using the local variable ip_block also i am planning to take in one separate commit as it impacts all the functions where these changes are being d

Re: [PATCH v2 1/2] drm/amdgpu: move error log from ring write to commit

2024-10-07 Thread Khatri, Sunil
On 10/7/2024 7:18 PM, Christian König wrote: Am 03.10.24 um 10:28 schrieb Sunil Khatri: Move the error message from ring write as an optimization to avoid printing that message on every write instead print once during commit if it exceeds write the allocated size i.e ring->count_dw. Also we d

Re: [PATCH v5 02/12] drm/amdgpu: add helper function amdgpu_ip_block_suspend

2024-10-18 Thread Khatri, Sunil
On 10/18/2024 4:40 PM, Christian König wrote: Am 17.10.24 um 18:25 schrieb Sunil Khatri: Use the helper function amdgpu_ip_block_suspend where same checks and calls are repeated. I strongly suggest to squash this patch and the next one together. Sure. Noted Signed-off-by: Sunil Khatri ---

Re: [PATCH v6 00/12] validate/clean the functions of ip funcs

2024-10-18 Thread Khatri, Sunil
On 10/18/2024 7:08 PM, Christian König wrote: Patches #2, #3 and #12 are Acked-by: Christian König The rest are Reviewed-by: Christian König Maybe give others till Monday to take a look as well, could be that Alex, Lijo or somebody else point out that we are ignoring the suspend return c

Re: [PATCH v6 00/12] validate/clean the functions of ip funcs

2024-10-21 Thread Khatri, Sunil
On 10/21/2024 10:11 PM, Deucher, Alexander wrote: [AMD Official Use Only - AMD Internal Distribution Only] -Original Message- From: Khatri, Sunil Sent: Friday, October 18, 2024 10:07 AM To: Koenig, Christian ; Deucher, Alexander ; Liu, Leo ; Lazar, Lijo Cc: amd-gfx

Re: [PATCH v4 15/15] drm/amdgpu: validate get_clockgating_state before use

2024-10-17 Thread Khatri, Sunil
On 10/17/2024 5:50 PM, Christian König wrote: Am 17.10.24 um 12:06 schrieb Sunil Khatri: Validate the function pointer for get_clockgating_state before making a function call. Oh, I'm not sure if that is necessary or not. The NBIO, HDP and SMUIO functions are not IP specific. For many socs
