Re: [PATCH v2] drm/amdgpu: Protect the validate list with a mutex

2022-06-22 Thread Christian König
The mutex must be added to the bo_list structure, not the parser structure. The parser is only a temporary structure we allocate for the current thread. Regards, Christian. On 23.06.22 at 06:39, Luben Tuikov wrote: Protect the parser's validate list with a mutex in order to avoid buffer objec
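A minimal user-space sketch of the point made in this reply, assuming POSIX threads rather than the kernel's mutex API: the lock is placed in the long-lived shared object, not in the per-thread temporary that points at it. All names here (demo_bo_list, demo_parser, demo_parser_add) are hypothetical stand-ins and are not the amdgpu structures or functions.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define DEMO_MAX_ENTRIES 8

/* Long-lived object shared by all submitting threads (hypothetical stand-in). */
struct demo_bo_list {
    pthread_mutex_t lock;              /* guards entries[] and count */
    int entries[DEMO_MAX_ENTRIES];
    size_t count;
};

/* Short-lived, per-thread object (hypothetical stand-in for a parser).
 * A mutex stored here would be private to one thread and protect nothing. */
struct demo_parser {
    struct demo_bo_list *list;
};

static void demo_bo_list_init(struct demo_bo_list *list)
{
    assert(list != NULL);
    pthread_mutex_init(&list->lock, NULL);
    list->count = 0;
}

/* Append a handle to the shared list; returns 0 on success, -1 if full. */
static int demo_parser_add(struct demo_parser *p, int handle)
{
    int ret = -1;

    pthread_mutex_lock(&p->list->lock);
    if (p->list->count < DEMO_MAX_ENTRIES) {
        p->list->entries[p->list->count++] = handle;
        ret = 0;
    }
    pthread_mutex_unlock(&p->list->lock);
    return ret;
}
```

Because every temporary object reaches the same lock through its list pointer, two threads working on the same shared list serialize on one mutex.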

Re: [PATCH 3/5] drm/amdgpu: Prevent race between late signaled fences and GPU reset.

2022-06-22 Thread Christian König
On 22.06.22 at 19:31, Andrey Grodzovsky wrote: Just a ping You need to give me at least some time to look into this. Andrey On 2022-06-21 15:45, Andrey Grodzovsky wrote: On 2022-06-21 03:25, Christian König wrote: On 21.06.22 at 00:03, Andrey Grodzovsky wrote: Problem: After we start

Re: [PATCH 5/5] drm/amdgpu: Follow up change to previous drm scheduler change.

2022-06-22 Thread Christian König
On 22.06.22 at 19:19, Andrey Grodzovsky wrote: On 2022-06-22 03:17, Christian König wrote: On 21.06.22 at 22:00, Andrey Grodzovsky wrote: On 2022-06-21 03:28, Christian König wrote: On 21.06.22 at 00:03, Andrey Grodzovsky wrote: Align refcount behaviour for amdgpu_job embedded HW fence w

[PATCH v2] drm/amdgpu: Protect the validate list with a mutex

2022-06-22 Thread Luben Tuikov
Protect the parser's validate list with a mutex in order to avoid buffer object corruption as recorded in the link below. Cc: Christian König Cc: Alex Deucher Cc: Andrey Grodzovsky Cc: Vitaly Prosyak Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2048 Signed-off-by: Luben Tuikov --- d

[PATCH] drm/amdgpu: Protect the validate list with a mutex

2022-06-22 Thread Luben Tuikov
Protect the parser's validate list with a mutex in order to avoid buffer object corruption as recorded in the link below. Cc: Christian König Cc: Alex Deucher Cc: Andrey Grodzovsky Cc: Vitaly Prosyak Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2048 Signed-off-by: Luben Tuikov --- d

Re: [PATCH v5 01/13] mm: add zone device coherent type memory support

2022-06-22 Thread Sierra Guiza, Alejandro (Alex)
On 6/21/2022 11:16 AM, David Hildenbrand wrote: On 21.06.22 18:08, Sierra Guiza, Alejandro (Alex) wrote: On 6/21/2022 7:25 AM, David Hildenbrand wrote: On 21.06.22 13:55, Alistair Popple wrote: David Hildenbrand writes: On 21.06.22 13:25, Felix Kuehling wrote: Am 6/17/22 um 23:19 schrieb

Re: [PATCH v5 01/13] mm: add zone device coherent type memory support

2022-06-22 Thread Sierra Guiza, Alejandro (Alex)
On 6/21/2022 7:16 PM, Alistair Popple wrote: David Hildenbrand writes: On 21.06.22 18:08, Sierra Guiza, Alejandro (Alex) wrote: On 6/21/2022 7:25 AM, David Hildenbrand wrote: On 21.06.22 13:55, Alistair Popple wrote: David Hildenbrand writes: On 21.06.22 13:25, Felix Kuehling wrote: A

Re: [RFC 0/3] drm/amd/display: Introduce KUnit to Display Mode Library

2022-06-22 Thread Rodrigo Siqueira Jordao
Hi, First of all, thanks a lot for exploring the introduction of kunit inside amdgpu. See my inline comments On 2022-06-18 05:08, David Gow wrote: On Sat, Jun 18, 2022 at 4:24 AM Maíra Canal wrote: On 6/17/22 04:55, David Gow wrote: On Fri, Jun 17, 2022 at 6:41 AM Maíra Canal wrote: H

RE: [PATCH v4 2/3] drm/amdkfd: Enable GFX11 usermode queue oversubscription

2022-06-22 Thread Sider, Graham
[AMD Official Use Only - General] >> On 2022-06-22 11:36, Graham Sider wrote: >> Starting with GFX11, MES requires wptr BOs to be GTT allocated/mapped to >> GART for usermode queues in order to support oversubscription. In the >> case that work is submitted to an unmapped queue, MES must have a GA

[pull] amdgpu drm-fixes-5.19

2022-06-22 Thread Alex Deucher
Hi Dave, Daniel, Fixes for 5.19. The following changes since commit a111daf0c53ae91e71fd2bfe7497862d14132e3e: Linux 5.19-rc3 (2022-06-19 15:06:47 -0500) are available in the Git repository at: https://gitlab.freedesktop.org/agd5f/linux.git tags/amd-drm-fixes-5.19-2022-06-22 for you to fe

[linux-next:master] BUILD REGRESSION ac0ba5454ca85162c08dc429fef1999e077ca976

2022-06-22 Thread kernel test robot
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master branch HEAD: ac0ba5454ca85162c08dc429fef1999e077ca976 Add linux-next specific files for 20220622 Error/Warning reports: https://lore.kernel.org/linux-mm/202206212029.yr5m7cd3-...@intel.com https

Re: Using generic fbdev helpers breaks hibernation

2022-06-22 Thread Alex Deucher
Thanks Thomas. I think this got me on the right track. Alex On Tue, Jun 21, 2022 at 6:25 AM Thomas Zimmermann wrote: > > Hi > > On 21.06.22 at 00:02, Alex Deucher wrote: > > Maybe someone more familiar with the generic drm fbdev helpers can > > help me understand why they don't work with hiber

Re: [PATCH] gpu/drm/radeon: Fix typo in comments

2022-06-22 Thread Alex Deucher
Applied. Thanks! On Wed, Jun 22, 2022 at 10:24 AM Jiang Jian wrote: > > Remove the repeated word 'and' from comments > > Signed-off-by: Jiang Jian > --- > drivers/gpu/drm/radeon/r300_reg.h | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/radeon/r300_reg.

Re: [PATCH] drm/amdgpu/vcn: fix no previous prototype warning

2022-06-22 Thread Alex Deucher
Reviewed-by: Alex Deucher On Wed, Jun 22, 2022 at 10:26 AM Ruijing Dong wrote: > > Declare 'static', as the function is not intended to be used > outside of this translation unit. > > Fixes: 748483dbc215 ("drm/amdgpu/vcn: add unified queue ib test") > Reported-by: kernel test robot > Signed-off

Re: [PATCH v4 2/3] drm/amdkfd: Enable GFX11 usermode queue oversubscription

2022-06-22 Thread philip yang
On 2022-06-22 11:36, Graham Sider wrote: Starting with GFX11, MES requires wptr BOs to be GTT allocated/mapped to GART for usermode queues in order to support oversubscription. In the case that work is submitted to an unmapped queue, MES must have a GART wptr

Re: [PATCH] drm/fourcc: fix integer type usage in uapi header

2022-06-22 Thread Alex Deucher
Applied. Thanks! Alex On Wed, Jun 22, 2022 at 3:02 AM Simon Ser wrote: > > On Tuesday, June 21st, 2022 at 22:39, Carlos Llamas > wrote: > > > Kernel uapi headers are supposed to use __[us]{8,16,32,64} types defined > > by as opposed to 'uint32_t' and similar. See [1] for the > > relevant dis

Re: [PATCH 3/5] drm/amdgpu: Prevent race between late signaled fences and GPU reset.

2022-06-22 Thread Andrey Grodzovsky
Just a ping Andrey On 2022-06-21 15:45, Andrey Grodzovsky wrote: On 2022-06-21 03:25, Christian König wrote: On 21.06.22 at 00:03, Andrey Grodzovsky wrote: Problem: After we start handling timed-out jobs we assume their fences won't be signaled, but we cannot be sure and sometimes they fire

Re: [PATCH 5/5] drm/amdgpu: Follow up change to previous drm scheduler change.

2022-06-22 Thread Andrey Grodzovsky
On 2022-06-22 03:17, Christian König wrote: On 21.06.22 at 22:00, Andrey Grodzovsky wrote: On 2022-06-21 03:28, Christian König wrote: On 21.06.22 at 00:03, Andrey Grodzovsky wrote: Align refcount behaviour for amdgpu_job embedded HW fence with classic pointer style HW fences by increasin

[PATCH v4 2/3] drm/amdkfd: Enable GFX11 usermode queue oversubscription

2022-06-22 Thread Graham Sider
Starting with GFX11, MES requires wptr BOs to be GTT allocated/mapped to GART for usermode queues in order to support oversubscription. In the case that work is submitted to an unmapped queue, MES must have a GART wptr address to determine whether the queue should be mapped. This change is accompa

[PATCH v4 3/3] drm/amdgpu: Update mes_v11_api_def.h

2022-06-22 Thread Graham Sider
Update MES API to support oversubscription without aggregated doorbell for usermode queues. v2: Change oversubscription_no_aggregated_en to is_kfd_process (align with MES) Signed-off-by: Graham Sider Acked-by: Felix Kuehling Reviewed-by: Jack Xiao --- drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c

[PATCH v4 1/3] drm/amdgpu: Fetch MES scheduler/KIQ versions

2022-06-22 Thread Graham Sider
Store MES scheduler and MES KIQ version numbers in amdgpu_mes for GFX11. Signed-off-by: Graham Sider Acked-by: Felix Kuehling Reviewed-by: Jack Xiao --- drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h | 3 +++ drivers/gpu/drm/amd/amdgpu/mes_v11_0.c | 12 2 files changed, 15 insertions(+

Re: [PATCH 1/5] drm/amdgpu: Fix possible refcount leak for release of external_hw_fence

2022-06-22 Thread Christian König
On 22.06.22 at 17:01, Andrey Grodzovsky wrote: On 2022-06-22 05:00, Christian König wrote: On 21.06.22 at 21:34, Andrey Grodzovsky wrote: On 2022-06-21 03:19, Christian König wrote: On 21.06.22 at 00:02, Andrey Grodzovsky wrote: Problem: In amdgpu_job_submit_direct - The refcount should

Re: [PATCH 1/5] drm/amdgpu: Fix possible refcount leak for release of external_hw_fence

2022-06-22 Thread Andrey Grodzovsky
On 2022-06-22 05:00, Christian König wrote: On 21.06.22 at 21:34, Andrey Grodzovsky wrote: On 2022-06-21 03:19, Christian König wrote: On 21.06.22 at 00:02, Andrey Grodzovsky wrote: Problem: In amdgpu_job_submit_direct - The refcount should drop by 2 but it drops only by 1. amdgpu_ib_sch

[PATCH] gpu/drm/radeon: Fix typo in comments

2022-06-22 Thread Jiang Jian
Remove the repeated word 'and' from comments Signed-off-by: Jiang Jian --- drivers/gpu/drm/radeon/r300_reg.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/radeon/r300_reg.h b/drivers/gpu/drm/radeon/r300_reg.h index 60d5413bafa1..9d341cff63ee 100644 --- a/dr

[PATCH] drm/amdgpu/display: reduce stack size in dml32_ModeSupportAndSystemConfigurationFull()

2022-06-22 Thread Alex Deucher
Move more stack variables into the dummy vars structure on the heap. Fixes stack frame size errors: drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn32/display_mode_vba_32.c: In function 'dml32_ModeSupportAndSystemConfigurationFull': drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn32/display_mode_vba_

[PATCH] drm/amdgpu/vcn: fix no previous prototype warning

2022-06-22 Thread Ruijing Dong
Declare 'static', as the function is not intended to be used outside of this translation unit. Fixes: 748483dbc215 ("drm/amdgpu/vcn: add unified queue ib test") Reported-by: kernel test robot Signed-off-by: Ruijing Dong --- drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c | 2 +- 1 file changed, 1 inser
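As a rough illustration of the kind of fix this patch describes, the sketch below shows how a "no previous prototype" warning (as raised by GCC or Clang with -Wmissing-prototypes) can be avoided in two ways: giving a file-local helper internal linkage with 'static', or declaring a prototype for a function that is meant to be exported. The function names (demo_ring_test, demo_run) are hypothetical and are not taken from amdgpu_vcn.c.

```c
#include <assert.h>

/* File-local helper: 'static' gives it internal linkage, so the compiler
 * does not expect a prior prototype and the warning does not fire. */
static int demo_ring_test(int value)
{
    return value * 2;
}

/* Exported function: a prototype (normally placed in a header) precedes
 * the definition, which also satisfies -Wmissing-prototypes. */
int demo_run(int value);

int demo_run(int value)
{
    assert(value >= 0);
    return demo_ring_test(value) + 1;
}
```

Choosing 'static' is the simpler option whenever no other translation unit calls the function, since it also lets the compiler inline the helper or flag it as unused.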

[PATCH v3] drm/amdgpu: To flush tlb for MMHUB of RAVEN series

2022-06-22 Thread Ji, Ruili
From: Ruili Ji amdgpu: [mmhub0] no-retry page fault (src_id:0 ring:40 vmid:8 pasid:32769, for process test_basic pid 3305 thread test_basic pid 3305) amdgpu: in page starting at address 0x7ff990003000 from IH client 0x12 (VMC) amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00840051 amdgpu: Faulty U

Re: [PATCH v2] mdkfd: To flush tlb for MMHUB of GFX9 series

2022-06-22 Thread philip yang
On 2022-06-22 02:45, Ji, Ruili wrote: From: Ruili Ji amdgpu: [mmhub0] no-retry page fault (src_id:0 ring:40 vmid:8 pasid:32769, for process test_basic pid 3305 thread test_basic pid 3305) amdgpu: in page starting at address 0x7ff990003000 from IH client 0x12

Re: [PATCH 2/2] drm/amdgpu: use real_vram_size in ttm_vram_fops

2022-06-22 Thread Christian König
On 22.06.22 at 12:07, Pierre-Eric Pelloux-Prayer wrote: If amdgpu.vramlimit= is used, amdgpu_gmc_vram_location will update real_vram_size based on this value. mc_vram_size is the real amount of VRAM, initialized in gmc_..._mc_init. Thinking more about it I came to the conclusion that this

Re: [PATCH 1/2] drm/amdgpu: fix amdgpu.vramlimit handling

2022-06-22 Thread Christian König
On 22.06.22 at 12:07, Pierre-Eric Pelloux-Prayer wrote: Without this change amdgpu_ttm_training_data_block_init tries to allocate at the end of the real amount of RAM, which then fails like this if amdgpu.vramlimit= is used: [drm:amdgpu_ttm_init [amdgpu]] *ERROR* alloc c2p_bo failed(-12

[PATCH Review 1/1] drm/amdgpu: add missed ras block id

2022-06-22 Thread Stanley . Yang
The VCN and JPEG ras are supported, so add VCN and JPEG ras block id. Signed-off-by: Stanley.Yang --- drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h | 4 drivers/gpu/drm/amd/amdgpu/ta_ras_if.h | 2 ++ 2 files changed, 6 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h b/driver

[PATCH 2/2] drm/amdgpu: use real_vram_size in ttm_vram_fops

2022-06-22 Thread Pierre-Eric Pelloux-Prayer
If amdgpu.vramlimit= is used, amdgpu_gmc_vram_location will update real_vram_size based on this value. mc_vram_size is the real amount of VRAM, initialized in gmc_..._mc_init. Signed-off-by: Pierre-Eric Pelloux-Prayer --- drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 10 +- 1 file change

[PATCH 1/2] drm/amdgpu: fix amdgpu.vramlimit handling

2022-06-22 Thread Pierre-Eric Pelloux-Prayer
Without this change amdgpu_ttm_training_data_block_init tries to allocate at the end of the real amount of RAM, which then fails like this if amdgpu.vramlimit= is used: [drm:amdgpu_ttm_init [amdgpu]] *ERROR* alloc c2p_bo failed(-12)! [drm:amdgpu_device_init.cold [amdgpu]] *ERROR* sw_init

Re: [PATCH 1/5] drm/amdgpu: Fix possible refcount leak for release of external_hw_fence

2022-06-22 Thread Christian König
On 21.06.22 at 21:34, Andrey Grodzovsky wrote: On 2022-06-21 03:19, Christian König wrote: On 21.06.22 at 00:02, Andrey Grodzovsky wrote: Problem: In amdgpu_job_submit_direct - The refcount should drop by 2 but it drops only by 1. amdgpu_ib_sched->emit -> refcount 1 from first fence init dm

Re: [PATCH 5/5] drm/amdgpu: Follow up change to previous drm scheduler change.

2022-06-22 Thread Christian König
On 21.06.22 at 22:00, Andrey Grodzovsky wrote: On 2022-06-21 03:28, Christian König wrote: On 21.06.22 at 00:03, Andrey Grodzovsky wrote: Align refcount behaviour for amdgpu_job embedded HW fence with classic pointer style HW fences by increasing refcount each time emit is called so amdgpu c

Re: [PATCH] drm/fourcc: fix integer type usage in uapi header

2022-06-22 Thread Simon Ser
On Tuesday, June 21st, 2022 at 22:39, Carlos Llamas wrote: > Kernel uapi headers are supposed to use __[us]{8,16,32,64} types defined > by as opposed to 'uint32_t' and similar. See [1] for the > relevant discussion about this topic. In this particular case, the usage > of 'uint64_t' escaped head

Re: [PATCH] drm/amd/display: Add missing hard-float compile flags for PPC64 builds

2022-06-22 Thread Guenter Roeck
On Mon, Jun 20, 2022 at 05:51:04PM -0400, Alex Deucher wrote: > On Sat, Jun 18, 2022 at 7:27 PM Guenter Roeck wrote: > > > > ppc:allmodconfig builds fail with the following error. > > > > powerpc64-linux-ld: > > drivers/gpu/drm/amd/amdgpu/../display/dc/dml/display_mode_lib.o > >

Re: [PATCH v5 01/13] mm: add zone device coherent type memory support

2022-06-22 Thread Alistair Popple
David Hildenbrand writes: > On 21.06.22 18:08, Sierra Guiza, Alejandro (Alex) wrote: >> >> On 6/21/2022 7:25 AM, David Hildenbrand wrote: >>> On 21.06.22 13:55, Alistair Popple wrote: David Hildenbrand writes: > On 21.06.22 13:25, Felix Kuehling wrote: >> Am 6/17/22 um 23:19