Re: [PATCH] drm/amdgpu: Fix size calculation when init onchip memory

2020-10-23 Thread Christian König
Am 23.10.20 um 07:41 schrieb xinhui pan: Size is page count here. Signed-off-by: xinhui pan Ah yes that one again. At some point we really need to clean that up. Patch is Reviewed-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 4 ++-- 1 file changed, 2 insertions(+),
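The snippet above only says that the size argument is a page count, and the diff itself is cut off. As a rough illustration of that kind of unit mismatch, the sketch below converts between bytes and pages. It is not taken from the patch: the `demo_` names are hypothetical, and the 4 KiB page size is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB pages. Real kernels use PAGE_SHIFT from the arch headers. */
#define DEMO_PAGE_SHIFT 12
#define DEMO_PAGE_SIZE  (UINT64_C(1) << DEMO_PAGE_SHIFT)

/* Hypothetical helper: round a byte size up to a whole number of pages. */
uint64_t demo_bytes_to_pages(uint64_t bytes)
{
	return (bytes + DEMO_PAGE_SIZE - 1) >> DEMO_PAGE_SHIFT;
}

/* Hypothetical helper: turn a page count back into bytes. */
uint64_t demo_pages_to_bytes(uint64_t pages)
{
	/* Guard against overflow when shifting left. */
	assert(pages <= (UINT64_MAX >> DEMO_PAGE_SHIFT));
	return pages << DEMO_PAGE_SHIFT;
}
```

Passing a byte size to an interface that expects pages (or the reverse) is off by a factor of the page size, which is why the unit has to be checked at each call site.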

Re: kgdb_breakpoint() usage leads to kernel panic

2020-10-23 Thread Takashi Iwai
On Thu, 22 Oct 2020 11:03:43 +0200, Takashi Iwai wrote: > > Hi, > > we recently stumbled on a kernel crash in amdgpu [*], and the kernel > messages indicated that it's from kgdb_breakpoint() call in > ASSERT_CRITICAL() macro. > > Since CONFIG_KGDB=y is set on the openSUSE distro kernels, the > b

[PATCH 3/3] drm/amd/display: Clean up debug macros

2020-10-23 Thread Takashi Iwai
This patch simplifies the ASSERT*() and BREAK_TO_DEBUGGER() macros: - Move the dependency check of CONFIG_KGDB into Kconfig - Unify the kgdb_breakpoint() call - Drop the non-existing CONFIG_HAVE_KGDB Also align the behavior of ASSERT() macro in both cases with and without CONFIG_DEBUG_KERNEL_DC.

[PATCH 0/3] drm/amd/display: Fix kernel panic by breakpoint

2020-10-23 Thread Takashi Iwai
Hi, the amdgpu driver's ASSERT_CRITICAL() macro calls kgdb_breakpoint() even if no debug option is set, and this leads to a kernel panic on distro kernels. The first two patches are one-line fixes for those, while the last one is a cleanup of those debug macros. Takashi === Takashi

[PATCH 2/3] drm/amd/display: Don't invoke kgdb_breakpoint() unconditionally

2020-10-23 Thread Takashi Iwai
ASSERT_CRITICAL() invokes kgdb_breakpoint() whenever either CONFIG_KGDB or CONFIG_HAVE_KGDB is set. This, however, may lead to a kernel panic when no kdb stuff is attached, since the kgdb_breakpoint() call issues INT3. It's nothing but a surprise for normal end-users. For avoiding the pitfall, m
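The idea described above, trapping into a debugger only when a debug option explicitly asks for it, might be sketched in userspace C as follows. This is not the driver's code: `DEMO_DEBUG_BREAK`, `DEMO_ASSERT_CRITICAL()` and `demo_breakpoint()` are hypothetical stand-ins, and the "breakpoint" only increments a counter instead of issuing INT3.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical build option standing in for an explicit debug Kconfig symbol. */
#ifdef DEMO_DEBUG_BREAK
#define DEMO_BREAK_ENABLED 1
#else
#define DEMO_BREAK_ENABLED 0
#endif

int demo_break_count; /* number of simulated breakpoints taken */

/* Stand-in for kgdb_breakpoint(); a real one would trap into the debugger. */
void demo_breakpoint(void)
{
	demo_break_count++;
}

/* On failure: trap only if the debug option is set, otherwise just log. */
#define DEMO_ASSERT_CRITICAL(expr)                                          \
	do {                                                                \
		if (!(expr)) {                                              \
			if (DEMO_BREAK_ENABLED)                             \
				demo_breakpoint();                          \
			else                                                \
				fprintf(stderr, "assertion failed: %s\n",   \
					#expr);                             \
		}                                                           \
	} while (0)

/* Returns how many simulated breakpoints one failed assertion caused. */
int demo_failed_assert_breaks(void)
{
	demo_break_count = 0;
	DEMO_ASSERT_CRITICAL(1 == 2);
	return demo_break_count;
}

/* A failed assertion traps exactly when the debug option is enabled. */
void demo_self_check(void)
{
	assert(demo_failed_assert_breaks() == DEMO_BREAK_ENABLED);
}
```

Built without `-DDEMO_DEBUG_BREAK`, a failed assertion is only logged, which mirrors the behavior a normal end-user kernel would want.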

[PATCH 1/3] drm/amd/display: Fix kernel panic by dal_gpio_open() error

2020-10-23 Thread Takashi Iwai
Currently both error code paths handled in dal_gpio_open_ex() issue ASSERT_CRITICAL(), and this leads to an unnecessary kernel panic if CONFIG_KGDB is enabled. Since both are basically non-critical errors and can be recovered, drop those assert calls and use a safer one, BREAK_TO_DEBUGGER(), for

[PATCH] Revert "drm/amdgpu: IP discovery table is not ready yet for VG"

2020-10-23 Thread Xiaomeng Hou
This reverts commit ba502322c9f216552485cea967aeb8adbaf03a02. IP discovery table has been verified on vangogh. Signed-off-by: Xiaomeng Hou --- drivers/gpu/drm/amd/amdgpu/nv.c | 4 1 file changed, 4 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/nv.c b/drivers/gpu/drm/amd/amdgpu/nv.c

Re: [PATCH] Revert "drm/amdgpu: IP discovery table is not ready yet for VG"

2020-10-23 Thread Huang Rui
On Fri, Oct 23, 2020 at 04:21:15PM +0800, Hou, Xiaomeng (Matthew) wrote: > This reverts commit ba502322c9f216552485cea967aeb8adbaf03a02. I suggest you don't frame this as reverting a patch, because the previous commit was correct at the time; the gap was in firmware/SBIOS support. Just modify the sub

[PATCH 07/65] drm/vblank: Annotate with dma-fence signalling section

2020-10-23 Thread Daniel Vetter
This is rather overkill since currently all drivers call this from hardirq (or at least timers). But maybe in the future we're going to have threaded irq handlers and whatnot, so it doesn't hurt to be prepared. Plus this is an easy start for sprinkling these fence annotations into shared code. Cc: linux-

[PATCH 06/65] drm/vkms: Annotate vblank timer

2020-10-23 Thread Daniel Vetter
This is needed to signal the fences from page flips, annotate it accordingly. We need to annotate entire timer callback since if we get stuck anywhere in there, then the timer stops, and hence fences stop. Just annotating the top part that does the vblank handling isn't enough. Tested-by: Melissa

[PATCH 05/65] drm/atomic-helper: Add dma-fence annotations

2020-10-23 Thread Daniel Vetter
This is a bit disappointing since we need to split the annotations over all the different parts. I was considering just leaking the critical section into the ->atomic_commit_tail callback of each driver. But that would mean we need to pass the fence_cookie into each driver (there's a total of 13 i

[PATCH 08/65] drm/amdgpu: add dma-fence annotations to atomic commit path

2020-10-23 Thread Daniel Vetter
I need a canary in a ttm-based atomic driver to make sure the dma_fence_begin/end_signalling annotations actually work. Cc: linux-me...@vger.kernel.org Cc: linaro-mm-...@lists.linaro.org Cc: linux-r...@vger.kernel.org Cc: amd-gfx@lists.freedesktop.org Cc: intel-...@lists.freedesktop.org Cc: Chris

[PATCH 19/65] drm/amdgpu: s/GFP_KERNEL/GFP_ATOMIC in scheduler code

2020-10-23 Thread Daniel Vetter
My dma-fence lockdep annotations caught an inversion because we allocate memory where we really shouldn't: kmem_cache_alloc+0x2b/0x6d0 amdgpu_fence_emit+0x30/0x330 [amdgpu] amdgpu_ib_schedule+0x306/0x550 [amdgpu] amdgpu_job_run+0x10f/0x260 [amdgpu] drm_sched

[PATCH 17/65] drm/scheduler: use dma-fence annotations in main thread

2020-10-23 Thread Daniel Vetter
If the scheduler rt thread gets stuck on a mutex that we're holding while waiting for gpu workloads to complete, we have a problem. Add dma-fence annotations so that lockdep can check this for us. I've tried to quite carefully review this, and I think it's at the right spot. But obviously no exper

[PATCH 18/65] drm/amdgpu: use dma-fence annotations in cs_submit()

2020-10-23 Thread Daniel Vetter
This is a bit tricky: since ->notifier_lock is held while calling dma_fence_wait, we must ensure that the read side (i.e. dma_fence_begin_signalling) is also on the same side. If we mix this up lockdep complains, and that's again why we want to have these annotations. A nice side effect of this is

[PATCH 20/65] drm/scheduler: use dma-fence annotations in tdr work

2020-10-23 Thread Daniel Vetter
In the face of unprivileged userspace being able to submit bogus gpu workloads the kernel needs gpu timeout and reset (tdr) to guarantee that dma_fences actually complete. Annotate this worker to make sure we don't have any accidental locking inversions or other problems lurking. Originally this

[PATCH 22/65] Revert "drm/amdgpu: add fbdev suspend/resume on gpu reset"

2020-10-23 Thread Daniel Vetter
This is one from the department of "maybe play lottery if you hit this, karma compensation might work". Or at least lockdep ftw! This reverts commit 565d1941557756a584ac357d945bc374d5fcd1d0. It's not quite as low-risk as the commit message claims, because this grabs console_lock, which might be h

[PATCH 23/65] drm/i915: Annotate dma_fence_work

2020-10-23 Thread Daniel Vetter
i915 does tons of allocations from this worker, which lockdep catches. Also generic infrastructure like this with big potential for how dma_fence or other cross driver contracts work, really should be reviewed on dri-devel. Implementing custom wheels for everything within the driver is a classic c

[PATCH 21/65] drm/amdgpu: use dma-fence annotations for gpu reset code

2020-10-23 Thread Daniel Vetter
To improve coverage also annotate the gpu reset code itself, since that's called from other places than drm/scheduler (which is already annotated). Annotations nest, so this doesn't break anything, and allows easier testing. Cc: linux-me...@vger.kernel.org Cc: linaro-mm-...@lists.linaro.org Cc: l

[PATCH 1/5] drm/radeon: Stop changing the drm_driver struct

2020-10-23 Thread Daniel Vetter
With only the kms driver left, we can fold this in. This means we need to move the ioctl table, which means one additional ioctl must be defined in headers. Also there's a conflict between the radeon_init macro and the module init function, so rename the module functions to avoid that. Signed-off

Re: [PATCH] drm/amdgpu: always reset asic when going into suspend

2020-10-23 Thread Alex Deucher
On Thu, Jan 16, 2020 at 10:14 AM Alex Deucher wrote: > > On Wed, Jan 15, 2020 at 2:44 AM Daniel Drake wrote: > > > > On Thu, Dec 19, 2019 at 10:08 PM Alex Deucher wrote: > > > I think there may be some AMD specific handling needed in > > > drivers/acpi/sleep.c. My understanding from reading the

[PATCH 1/3] drm/amdgpu: add mode2 reset support for vangogh

2020-10-23 Thread Alex Deucher
GPU reset is handled via SMU similar to previous APUs. Acked-by: Evan Quan Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/

[PATCH 2/3] drm/amdgpu/nv: add mode2 reset handling

2020-10-23 Thread Alex Deucher
Vangogh will use mode2 reset, so plumb it through the nv soc driver. Acked-by: Evan Quan Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/nv.c | 14 -- 1 file changed, 12 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/nv.c b/drivers/gpu/drm/amd/amdg

[PATCH 3/3] drm/amdgpu: Enable GPU reset for vangogh

2020-10-23 Thread Alex Deucher
Enable GPU reset when we encounter a hang. Acked-by: Evan Quan Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index a7c95b3205

[PATCH v3 03/56] amdgpu: fix a few kernel-doc markup issues

2020-10-23 Thread Mauro Carvalho Chehab
A kernel-doc markup can't be mixed with a random comment, as it causes parsing problems. While here, change an invalid kernel-doc markup into a common comment. Signed-off-by: Mauro Carvalho Chehab --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 8 +--- 1 file changed, 5 insertions(+), 3 de

[PATCH v3 02/56] drm: amdgpu_dm: fix a typo

2020-10-23 Thread Mauro Carvalho Chehab
dm_comressor_info -> dm_compressor_info The kernel-doc markup is right, but the struct itself and its references contain a typo. Signed-off-by: Mauro Carvalho Chehab --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 2 +- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h | 6 +++

[PATCH v3 11/56] drm/amdgpu: fix some kernel-doc markups

2020-10-23 Thread Mauro Carvalho Chehab
Some functions have different names between their prototypes and the kernel-doc markup. Signed-off-by: Mauro Carvalho Chehab --- drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 2 +- drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 2 +- include/uapi/drm/amdgpu_drm.h| 2 +- 3 files

Re: [PATCH v3 03/56] amdgpu: fix a few kernel-doc markup issues

2020-10-23 Thread Christian König
Am 23.10.20 um 18:32 schrieb Mauro Carvalho Chehab: A kernel-doc markup can't be mixed with a random comment, as it causes parsing problems. While here, change an invalid kernel-doc markup into a common comment. Signed-off-by: Mauro Carvalho Chehab Acked-by: Christian König --- drivers/

Re: [PATCH v3 11/56] drm/amdgpu: fix some kernel-doc markups

2020-10-23 Thread Christian König
Am 23.10.20 um 18:32 schrieb Mauro Carvalho Chehab: Some functions have different names between their prototypes and the kernel-doc markup. Signed-off-by: Mauro Carvalho Chehab Acked-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 2 +- drivers/gpu/drm/amd/amdgpu/

[PATCH] drm/amd/display: Fixed panic during seamless boot.

2020-10-23 Thread Aurabindo Pillai
From: David Galiffi [why] get_pixel_clk_frequency_100hz is undefined in clock_source_funcs. [how] set function pointer: ".get_pixel_clk_frequency_100hz = get_pixel_clk_frequency_100hz" Signed-off-by: David Galiffi --- drivers/gpu/drm/amd/display/dc/dce/dce_clock_source.c | 3 ++- 1 file chan
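The failure mode described above, a hook left unset in a table of function pointers, can be illustrated with a simplified userspace sketch. The `demo_` types and functions are hypothetical and much smaller than the driver's real `clock_source_funcs`; the NULL check in the caller is an extra illustration, whereas the patch itself fixes the problem by setting the pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct demo_clock_source;

/* Hypothetical, simplified ops table with a single hook. */
struct demo_clock_source_funcs {
	int (*get_pixel_clk_frequency_100hz)(const struct demo_clock_source *cs,
					     uint32_t *pixel_clk_100hz);
};

struct demo_clock_source {
	const struct demo_clock_source_funcs *funcs;
	uint32_t pixel_clk_khz;
};

/* Hypothetical implementation: report the clock in units of 100 Hz. */
int demo_get_pixel_clk_frequency_100hz(const struct demo_clock_source *cs,
				       uint32_t *pixel_clk_100hz)
{
	*pixel_clk_100hz = cs->pixel_clk_khz * 10; /* 1 kHz = 10 x 100 Hz */
	return 1;
}

/* Hook left unset: calling through it would dereference NULL. */
const struct demo_clock_source_funcs demo_funcs_incomplete = {
	.get_pixel_clk_frequency_100hz = NULL,
};

/* Hook wired up, in the style of the assignment quoted in the commit message. */
const struct demo_clock_source_funcs demo_funcs_fixed = {
	.get_pixel_clk_frequency_100hz = demo_get_pixel_clk_frequency_100hz,
};

/* Defensive caller: returns 0 when the hook is missing instead of crashing. */
int demo_query_pixel_clk(const struct demo_clock_source *cs, uint32_t *out)
{
	assert(cs != NULL && cs->funcs != NULL && out != NULL);
	if (!cs->funcs->get_pixel_clk_frequency_100hz)
		return 0;
	return cs->funcs->get_pixel_clk_frequency_100hz(cs, out);
}
```

With `demo_funcs_incomplete` the query reports failure; with `demo_funcs_fixed` a 148500 kHz clock comes back as 1485000 in 100 Hz units.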

Re: [PATCH v3 03/56] amdgpu: fix a few kernel-doc markup issues

2020-10-23 Thread Alex Deucher
Applied. Thanks! Alex On Fri, Oct 23, 2020 at 12:38 PM Christian König wrote: > > Am 23.10.20 um 18:32 schrieb Mauro Carvalho Chehab: > > A kernel-doc markup can't be mixed with a random comment, > > as it causes parsing problems. > > > > While here, change an invalid kernel-doc markup into > >

Re: [PATCH v3 11/56] drm/amdgpu: fix some kernel-doc markups

2020-10-23 Thread Alex Deucher
Applied. Thanks! Alex On Fri, Oct 23, 2020 at 12:51 PM Christian König wrote: > > Am 23.10.20 um 18:32 schrieb Mauro Carvalho Chehab: > > Some functions have different names between their prototypes > > and the kernel-doc markup. > > > > Signed-off-by: Mauro Carvalho Chehab > > Acked-by: Chris

Re: [PATCH v3 02/56] drm: amdgpu_dm: fix a typo

2020-10-23 Thread Alex Deucher
Applied. Thanks! Alex On Fri, Oct 23, 2020 at 12:33 PM Mauro Carvalho Chehab wrote: > > dm_comressor_info -> dm_compressor_info > > The kernel-doc markup is right, but the struct itself > and its references contain a typo. > > Signed-off-by: Mauro Carvalho Chehab > --- > drivers/gpu

Re: [PATCH] drm/amd/display: Fixed panic during seamless boot.

2020-10-23 Thread Lakha, Bhawanpreet
[AMD Official Use Only - Internal Distribution Only] Reviewed-by: Bhawanpreet Lakha From: Aurabindo Pillai Sent: October 23, 2020 1:09 PM To: amd-gfx@lists.freedesktop.org Cc: Lakha, Bhawanpreet ; Galiffi, David Subject: [PATCH] drm/amd/display: Fixed panic du

Re: [PATCH] drm/amd/display: Fixed panic during seamless boot.

2020-10-23 Thread Aurabindo Pillai
> [AMD Official Use Only - Internal Distribution Only] > > Reviewed-by: Bhawanpreet Lakha Thanks for the review. > > From: Aurabindo Pillai > > Sent: October 23, 2020 1:09 PM > > To: amd-gfx@lists.freedesktop.org > > Cc: Lakha, Bhawanpreet ; Galiffi, David < > da

Re: [PATCH 0/3] drm/amd/display: Fix kernel panic by breakpoint

2020-10-23 Thread Luben Tuikov
On 2020-10-23 03:46, Takashi Iwai wrote: > Hi, > > the amdgpu driver's ASSERT_CRITICAL() macro calls the > kgdb_breakpoint() even if no debug option is set, and this leads to a > kernel panic on distro kernels. The first two patches are the > oneliner fixes for those, while the last one is the cl

Re: [PATCH] drm/amdgpu: Fix size calculation when init onchip memory

2020-10-23 Thread Luben Tuikov
On 2020-10-23 03:12, Christian König wrote: > Am 23.10.20 um 07:41 schrieb xinhui pan: >> Size is page count here. >> >> Signed-off-by: xinhui pan > > Ah yes that one again. At some point we really need to clean that up. > > Patch is Reviewed-by: Christian König > >> --- >> drivers/gpu/drm/a