[AMD Official Use Only - AMD Internal Distribution Only]
Sorry if I mangled the reply. AMD's mail servers seem to have a hiccup this
morning and I have to use OWA.
From: Matthew Brost
Sent: Friday, 19 July 2024 19:39
To: Christian König
Cc: Tvrtko Ur
Fix warning about kiq ring.
Unlock kiq ring when queue reset fails.
[ 285.999224] amdgpu :03:00.0: amdgpu: GPU reset begin!
[ 312.018425] watchdog: BUG: soft lockup - CPU#11 stuck for 26s!
[kworker/u64:2:878]
[ 312.018428] Modules linked in: amdgpu(E) amdxcp drm_exec gpu_sched drm_buddy
d
Thx, but in that case this patch won't help either; it just
mitigates the problem.
Can you try to reduce num_hw_submission for the MES ring?
Thanks,
Christian.
On 22.07.24 at 05:27, Xiao, Jack wrote:
[AMD Official Use Only - AMD Internal Distribution Only]
>> I think we rather n
This was basically just another one of amdgpu's hacks. The parameter
allowed restarting the scheduler without turning fence signaling on
again.
That this is absolutely not a good idea should be obvious by now since
the fences will then just sit there and never signal.
While at it, clean up the code
That's a known issue and we are already working on it.
Regards,
Christian.
On 20.07.24 at 19:08, Mikhail Gavrilov wrote:
Hi,
I spotted "MES failed to respond to msg=MISC (WAIT_REG_MEM)" messages
in my kernel log since 6.10-rc5.
After this message, usually follow "[drm:amdgpu_mes_reg_write_reg_
[AMD Official Use Only - AMD Internal Distribution Only]
>> Can you try to reduce num_hw_submission for the MES ring?
A smaller num_hw_submission should not help with this issue, because MES works
without the DRM scheduler, like the legacy KIQ. A smaller num_hw_submission will
result in a smaller MES ring size and
This commit addresses a potential null pointer dereference issue in the
`dcn30_init_hw` function. The issue could occur when `dc->clk_mgr` or
`dc->clk_mgr->funcs` is null.
The fix adds a check to ensure `dc->clk_mgr` and `dc->clk_mgr->funcs` are
not null before accessing its functions. This prevent
This commit addresses a potential null pointer dereference issue in the
`dcn32_init_hw` function. The issue could occur when `dc->clk_mgr` is
null.
The fix adds a check to ensure `dc->clk_mgr` is not null before
accessing its functions. This prevents a potential null pointer
dereference.
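
A minimal sketch of the kind of guard these commit messages describe, not the actual patch. The struct definitions, the `init_hw_clocks` helper, and the `init_calls` counter are hypothetical stand-ins for illustration. The sketch checks the whole pointer chain, which goes slightly beyond the `dc->clk_mgr`-only check described for `dcn32_init_hw`.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the display-core structures; the real
 * definitions in the amdgpu display code are much larger. */
struct clk_mgr;

struct clk_mgr_funcs {
	void (*init_clocks)(struct clk_mgr *clk_mgr);
};

struct clk_mgr {
	const struct clk_mgr_funcs *funcs;
	int init_calls; /* illustration only: counts init_clocks() calls */
};

struct dc {
	struct clk_mgr *clk_mgr;
};

/* Returns true if the clock manager hook was called, false if skipped. */
static bool init_hw_clocks(struct dc *dc)
{
	/* Check every pointer in the chain before dereferencing it. */
	if (!dc || !dc->clk_mgr || !dc->clk_mgr->funcs ||
	    !dc->clk_mgr->funcs->init_clocks)
		return false;

	dc->clk_mgr->funcs->init_clocks(dc->clk_mgr);
	return true;
}

/* Hypothetical hook used only to observe that the call happened. */
static void count_init(struct clk_mgr *clk_mgr)
{
	clk_mgr->init_calls++;
}
```

With a fully populated `struct dc` the hook runs once; with `clk_mgr` or `funcs` set to NULL the helper returns false instead of dereferencing a null pointer.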
Reported
What I meant is that the MES ring is now too small for the number of
packets written to it.
Either reduce the num_hw_submission or increase the MES ring size.
E.g. see this code here in amdgpu_ring_init:
if (ring->funcs->type == AMDGPU_RING_TYPE_KIQ)
sched_hw_submission
This commit addresses a potential null pointer dereference issue in the
`dcn401_init_hw` function. The issue could occur when `dc->clk_mgr` or
`dc->clk_mgr->funcs` is null.
The fix adds a check to ensure `dc->clk_mgr` and `dc->clk_mgr->funcs` are
not null before accessing its functions. This preven
Hi Matthew,
On 7/19/2024 4:01 PM, Matthew Auld wrote:
On 17/07/2024 16:02, Paneer Selvam, Arunpravin wrote:
On 7/16/2024 3:34 PM, Matthew Auld wrote:
On 16/07/2024 10:50, Paneer Selvam, Arunpravin wrote:
Hi Matthew,
On 7/10/2024 6:20 PM, Matthew Auld wrote:
On 10/07/2024 07:03, Paneer Sel
This commit adds a null check for the set_output_gamma function pointer
in the dcn30_set_output_transfer_func function. Previously,
set_output_gamma was being checked for nullity at line 386, but then it
was being dereferenced without any nullity check at line 401. This
could potentially lead to a
On Tue, Jul 16, 2024 at 10:48:03AM -0400, Hamza Mahfooz wrote:
Hi Sasha,
On 7/16/24 10:24, Sasha Levin wrote:
From: Tom Chung
[ Upstream commit 6b8487cdf9fc7bae707519ac5b5daeca18d1e85b ]
[Why]
Sometimes the new_crtc_state->vrr_infopacket did not sync up with the
current state.
It will affect
Hi Easwar,
merged to drm-intel-next. Thanks!
On Thu, Jul 11, 2024 at 05:27:31AM +, Easwar Hariharan wrote:
> I2C v7, SMBus 3.2, and I3C 1.1.1 specifications have replaced "master/slave"
> with more appropriate terms. Inspired by Wolfram's series to fix drivers/i2c/,
> fix the terminology for
Hi Thomas,
On 7/20/24 9:31 AM, Thomas Weißschuh wrote:
> Hi Hans,
>
> On 2024-07-18 10:25:18+, Hans de Goede wrote:
>> On 6/24/24 6:15 PM, Thomas Weißschuh wrote:
>>> On 2024-06-24 11:11:40+, Hans de Goede wrote:
On 6/23/24 10:51 AM, Thomas Weißschuh wrote:
> The value of "min_in
On 7/21/24 00:20, Basavaraj Natikar wrote:
> On 7/17/2024 4:51 PM, Linux regression tracking (Thorsten Leemhuis) wrote:
>> On 15.07.24 06:39, Chris Hixon wrote:
>>> System: HP ENVY x360 Convertible 15-ds1xxx; AMD Ryzen 7 4700U with
>>> Radeon Graphics
>>>
>>> Problem commits (introduced in v6.9-rc
On Tue, Jul 16, 2024 at 8:58 PM Jim Cromie wrote:
>
> resending to fix double-copies of a dozen patches.
> added 2 squash-ins to address Ville's designated-initializer comment.
>
> This fixes dynamic-debug support for DRM.debug, added via classmaps.
> commit bb2ff6c27bc9 (drm: Disable dynamic debu
On Tue, Jun 18, 2024 at 01:42:56PM +0200, Christian König wrote:
On 18.06.24 at 11:11, Pavel Machek wrote:
Hi!
[ Upstream commit a0cf36546cc24ae1c95d72253c7795d4d2fc77aa ]
The pointer parent may be NULLed by the function amdgpu_vm_pt_parent.
To make the code more robust, check the pointer pa
[Public]
Hi all,
This week this patchset was tested on the following systems:
* Lenovo ThinkBook T13s Gen4 with AMD Ryzen 5 6600U
* MSI Gaming X Trio RX 6800
* Gigabyte Gaming OC RX 7900 XTX
These systems were tested on the following display/connection types:
* eD
Fixes the below with gcc W=1:
drivers/gpu/drm/amd/amdgpu/../display/dc/dpp/dcn401/dcn401_dpp_dscl.c:961:
warning: Function parameter or struct member 'bs_coeffs_updated' not described
in 'dpp401_dscl_program_isharp'
Fixes: 431ae65ea4e1 ("drm/amd/display: ensure EASF and ISHARP coefficients are
On 17.07.24 at 22:37, Alex Deucher wrote:
It should work the same for compute as well as gfx.
Signed-off-by: Alex Deucher
Reviewed-by: Christian König for the whole
series.
---
drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/gpu/drm/a
On 17.07.24 at 22:38, Alex Deucher wrote:
Need to handle the interrupt enables for all pipes.
v2: fix indexing (Jessie)
Signed-off-by: Alex Deucher
Acked-by: Christian König for the whole series.
---
drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 130 +
1 file change
On 17.07.24 at 22:40, Alex Deucher wrote:
From: Jesse Zhang
For the bad opcode case, it will cause CP/ME hang.
The firmware will prevent the ME side from hanging by raising a bad opcode
interrupt.
And the driver needs to perform a vmid reset when receiving the interrupt.
v2: update irq namin
On Mon, Jul 22, 2024 at 9:55 AM Christian König
wrote:
>
> On 17.07.24 at 22:40, Alex Deucher wrote:
> > From: Jesse Zhang
> >
> > For the bad opcode case, it will cause CP/ME hang.
> > The firmware will prevent the ME side from hanging by raising a bad opcode
> > interrupt.
> > And the driver
On 22.07.24 at 15:52, Tvrtko Ursulin wrote:
On 19/07/2024 16:18, Christian König wrote:
On 19.07.24 at 15:02, Christian König wrote:
On 19.07.24 at 11:47, Tvrtko Ursulin wrote:
From: Tvrtko Ursulin
Long time ago in commit b3ac17667f11 ("drm/scheduler: rework entity
creation") a change wa
On 7/22/24 7:15 AM, Srinivasan Shanmugam wrote:
Fixes the below with gcc W=1:
drivers/gpu/drm/amd/amdgpu/../display/dc/dpp/dcn401/dcn401_dpp_dscl.c:961:
warning: Function parameter or struct member 'bs_coeffs_updated' not described
in 'dpp401_dscl_program_isharp'
Fixes: 431ae65ea4e1 ("drm/a
On 22.07.24 at 16:43, Tvrtko Ursulin wrote:
On 22/07/2024 15:06, Christian König wrote:
On 22.07.24 at 15:52, Tvrtko Ursulin wrote:
On 19/07/2024 16:18, Christian König wrote:
On 19.07.24 at 15:02, Christian König wrote:
On 19.07.24 at 11:47, Tvrtko Ursulin wrote:
From: Tvrtko Ursulin
Hi Easwar,
On Mon, Jul 22, 2024 at 09:15:08AM -0700, Easwar Hariharan wrote:
> On 7/22/2024 5:50 AM, Andi Shyti wrote:
> > Hi Easwar,
> >
> > merged to drm-intel-next. Thanks!
> >
> > On Thu, Jul 11, 2024 at 05:27:31AM +, Easwar Hariharan wrote:
> >> I2C v7, SMBus 3.2, and I3C 1.1.1 specific
On Jul 21 2024, Chris Hixon wrote:
> On 7/21/24 00:20, Basavaraj Natikar wrote:
>
> > On 7/17/2024 4:51 PM, Linux regression tracking (Thorsten Leemhuis) wrote:
> >> On 15.07.24 06:39, Chris Hixon wrote:
> >>> System: HP ENVY x360 Convertible 15-ds1xxx; AMD Ryzen 7 4700U with
> >>> Radeon Graphics
Reviewed-by: Alex Hung
On 2024-07-22 05:28, Srinivasan Shanmugam wrote:
This commit addresses a potential null pointer dereference issue in the
`dcn401_init_hw` function. The issue could occur when `dc->clk_mgr` or
`dc->clk_mgr->funcs` is null.
The fix adds a check to ensure `dc->clk_mgr` and
The number of watchpoints should be set and constrained per logical
partition device, not by the socket device.
Signed-off-by: Jonathan Kim
---
drivers/gpu/drm/amd/amdkfd/kfd_debug.c | 20 ++--
drivers/gpu/drm/amd/amdkfd/kfd_device.c | 4 ++--
drivers/gpu/drm/amd/amdkfd/kfd_pri
[Public]
Acked-by: Alex Deucher
From: Zhang, Yifan
Sent: Sunday, July 21, 2024 10:25 PM
To: amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander ; Zhang, Yifan
Subject: [PATCH] drm/amdgpu: skip kfd init if GFX is not ready.
avoid kfd init crash in that case.
On Mon, Jul 22, 2024 at 4:16 AM Jesse Zhang wrote:
>
> Fix warning about kiq ring.
> Unlock kiq ring when queue reset fails.
>
> [ 285.999224] amdgpu :03:00.0: amdgpu: GPU reset begin!
> [ 312.018425] watchdog: BUG: soft lockup - CPU#11 stuck for 26s!
> [kworker/u64:2:878]
> [ 312.018428]
Apply KIQ logic to MES.
MES doesn't really use the GPU scheduler. The base
drivers generally use the MES ring directly rather than
submitting IBs. However, amdgpu_sched_hw_submission
(which defaults to 2) limits the number of outstanding
fences to 2. KFD uses the MES for TLB flushes and the
2 f
Does this patch fix it?
https://patchwork.freedesktop.org/patch/605437/
Alex
On Mon, Jul 22, 2024 at 7:21 AM Christian König
wrote:
>
> What I meant is that the MES ring is now too small for the number of packets
> written to it.
>
> Either reduce the num_hw_submission or increase the MES ring s
Reviewed-by: Alex Hung
On 2024-07-22 04:51, Srinivasan Shanmugam wrote:
This commit addresses a potential null pointer dereference issue in the
`dcn30_init_hw` function. The issue could occur when `dc->clk_mgr` or
`dc->clk_mgr->funcs` is null.
The fix adds a check to ensure `dc->clk_mgr` and `
On Mon, Jul 22, 2024 at 4:50 AM Christian König
wrote:
>
> That's a known issue and we are already working on it.
Do either of these patches help?
https://patchwork.freedesktop.org/patch/605437/
https://patchwork.freedesktop.org/patch/605201/
Alex
>
> Regards,
> Christian.
>
> On 20.07.24 at 19
Reviewed-by: Alex Hung
On 2024-07-22 05:14, Srinivasan Shanmugam wrote:
This commit addresses a potential null pointer dereference issue in the
`dcn32_init_hw` function. The issue could occur when `dc->clk_mgr` is
null.
The fix adds a check to ensure `dc->clk_mgr` is not null before
accessing
[Why]
The page table of a compute VM in VRAM will be lost after a GPU reset.
VRAM won't be restored since compute VMs have no shadows.
[How]
Use the higher 32 bits of vm->generation to record a vram_lost_counter.
Reset the VM state machine when vm->generation is not equal to the
re-generation token.
v2: Check vm->
[AMD Official Use Only - AMD Internal Distribution Only]
> Does this patch fix it?
https://patchwork.freedesktop.org/patch/605437/
No, please do not check in the patch; it will break my fix.
Regards,
Jack
From: Alex Deucher
Sent: Tuesday, 23 July 2024
On Tue, Jul 16, 2024 at 01:10:37PM -0400, Alex Deucher wrote:
> From 8aaf8da07a8b542c0a0f4da2601da07beddfdeb0 Mon Sep 17 00:00:00 2001
> From: Alex Deucher
> Date: Tue, 16 Jul 2024 12:49:25 -0400
> Subject: [PATCH] drm/amd/display: fix corruption with high refresh rates on
> DCN 3.0
>
> This rev
Reviewed-by: Tom Chung
On 7/22/2024 7:48 PM, Srinivasan Shanmugam wrote:
This commit adds a null check for the set_output_gamma function pointer
in the dcn30_set_output_transfer_func function. Previously,
set_output_gamma was being checked for nullity at line 386, but then it
was being derefer