On Fri, Nov 01, 2024 at 09:32:23AM -0400, Thomas Frohwein wrote:
> >Synopsis:    amdgpu Radeon 780M graphics hang after page fault
> >Category:    system
> >Environment:
>       System      : OpenBSD 7.6
>       Details     : OpenBSD 7.6-current (GENERIC.MP) #393: Sat Oct 26 21:59:25 MDT 2024
>                        dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
> 
>       Architecture: OpenBSD.amd64
>       Machine     : amd64
> >Description:
>       After a variable amount of time of running a Godot game (Brotato),
>       generally 1-5 minutes, the system hangs and xconsole/messages show the
>       following just before the hang:
> 
> gmc_v11_0_process_interrupt *ERROR* [gfxhub] page fault (src_id:0 ring:24 vmid:2 pasid:32779, for process godot pid 7378 thread godot pid 387416)
> gmc_v11_0_process_interrupt *ERROR*   in page starting at address 0x0000000000000000 from client 10
> gfxhub_v3_0_print_l2_protection_fault_status *ERROR* GCVM_L2_PROTECTION_FAULT_STATUS:0x00201430
> gfxhub_v3_0_print_l2_protection_fault_status *ERROR*   Faulty UTCL2 client ID: SQC (data) (0xa)
> gfxhub_v3_0_print_l2_protection_fault_status *ERROR*   MORE_FAULTS: 0x0
> gfxhub_v3_0_print_l2_protection_fault_status *ERROR*   WALKER_ERROR: 0x0
> gfxhub_v3_0_print_l2_protection_fault_status *ERROR*   PERMISSION_FAULTS: 0x3
> gfxhub_v3_0_print_l2_protection_fault_status *ERROR*   MAPPING_ERROR: 0x0
> gfxhub_v3_0_print_l2_protection_fault_status *ERROR*   RW: 0x0
> 
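For reference, the individual fields printed above can be recovered from the raw
GCVM_L2_PROTECTION_FAULT_STATUS value. The following is only a sketch: the bit
positions are an assumption based on the gfxhub v3.0 register layout as I
understand it from the Linux amdgpu driver, and the names FIELDS and
decode_fault_status are made up for illustration. The layout does at least
reproduce the values in the log (PERMISSION_FAULTS 0x3, client ID 0xa, vmid 2).

```python
# Assumed field layout of GCVM_L2_PROTECTION_FAULT_STATUS:
# (name, shift, width in bits). Not taken from the report itself.
FIELDS = [
    ("MORE_FAULTS", 0, 1),
    ("WALKER_ERROR", 1, 3),
    ("PERMISSION_FAULTS", 4, 4),
    ("MAPPING_ERROR", 8, 1),
    ("CID", 9, 9),    # UTCL2 client ID
    ("RW", 18, 1),
    ("VMID", 20, 4),
]


def decode_fault_status(status):
    """Split a raw status word into the named bit fields listed in FIELDS."""
    return {
        name: (status >> shift) & ((1 << width) - 1)
        for name, shift, width in FIELDS
    }


if __name__ == "__main__":
    # Raw value copied from the log in the report.
    for name, value in decode_fault_status(0x00201430).items():
        print(f"{name}: {value:#x}")
```

This only splits the register into fields; it says nothing about why the
shader's access was rejected.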
> >How-To-Repeat:
>       So far only found with the Godot 3 game Brotato (but repeatedly, after
>       variable time). I've tried hard to reproduce it with 0ad or other Godot
>       games, but so far it has only happened with Brotato,
>       where I have reproduced it several times.
> 
> >Fix:
>       Unknown. I have tried setting AMD_DEBUG=dcc and then AMD_DEBUG=nodcc
>       based on [1,2].
> 
>       [1] https://gitlab.freedesktop.org/drm/amd/-/issues/2496
>       [2] https://gitlab.freedesktop.org/drm/amd/-/issues/2690

AMD people usually flag these as problems with Mesa (without
naming specific fixes).

> amdgpu0: IP DISCOVERY GC 11.0.1 12 CU rev 0x09

There is newer GC 11.0.1 firmware; does it help?
The linux-firmware commit doesn't specify what changed.

Index: sysutils/firmware/amdgpu/Makefile
===================================================================
RCS file: /cvs/ports/sysutils/firmware/amdgpu/Makefile,v
diff -u -p -r1.30 Makefile
--- sysutils/firmware/amdgpu/Makefile   11 Sep 2024 06:32:58 -0000      1.30
+++ sysutils/firmware/amdgpu/Makefile   2 Nov 2024 11:34:05 -0000
@@ -1,5 +1,5 @@
 FW_DRIVER=     amdgpu
-FW_VER=                20240909
+FW_VER=                20241017
 DISTNAME=      linux-firmware-${FW_VER}
 EXTRACT_SUFX=  .tar.xz
 EXTRACT_FILES= ${DISTNAME}/{LICENSE.\*,\*.bin}
Index: sysutils/firmware/amdgpu/distinfo
===================================================================
RCS file: /cvs/ports/sysutils/firmware/amdgpu/distinfo,v
diff -u -p -r1.27 distinfo
--- sysutils/firmware/amdgpu/distinfo   11 Sep 2024 06:32:58 -0000      1.27
+++ sysutils/firmware/amdgpu/distinfo   2 Nov 2024 11:34:42 -0000
@@ -1,2 +1,2 @@
-SHA256 (firmware/linux-firmware-20240909.tar.xz) = lD+9GYg8+OrfieCyJCJUnbBWVXsezTClZABhWXE2lnE=
-SIZE (firmware/linux-firmware-20240909.tar.xz) = 383099276
+SHA256 (firmware/linux-firmware-20241017.tar.xz) = omw471qDJy8rmM6L+MoYZahSo97qSc5ajdgEuRQ1EnM=
+SIZE (firmware/linux-firmware-20241017.tar.xz) = 397400292
Index: sysutils/firmware/amdgpu/pkg/PLIST
===================================================================
RCS file: /cvs/ports/sysutils/firmware/amdgpu/pkg/PLIST,v
diff -u -p -r1.19 PLIST
--- sysutils/firmware/amdgpu/pkg/PLIST  1 Jul 2024 06:46:15 -0000       1.19
+++ sysutils/firmware/amdgpu/pkg/PLIST  2 Nov 2024 11:35:10 -0000
@@ -160,6 +160,13 @@ firmware/amdgpu/gc_11_5_1_mes1.bin
 firmware/amdgpu/gc_11_5_1_mes_2.bin
 firmware/amdgpu/gc_11_5_1_pfp.bin
 firmware/amdgpu/gc_11_5_1_rlc.bin
+firmware/amdgpu/gc_11_5_2_imu.bin
+firmware/amdgpu/gc_11_5_2_me.bin
+firmware/amdgpu/gc_11_5_2_mec.bin
+firmware/amdgpu/gc_11_5_2_mes1.bin
+firmware/amdgpu/gc_11_5_2_mes_2.bin
+firmware/amdgpu/gc_11_5_2_pfp.bin
+firmware/amdgpu/gc_11_5_2_rlc.bin
 firmware/amdgpu/gc_9_4_3_mec.bin
 firmware/amdgpu/gc_9_4_3_rlc.bin
 firmware/amdgpu/green_sardine_asd.bin
@@ -393,6 +400,8 @@ firmware/amdgpu/psp_14_0_0_ta.bin
 firmware/amdgpu/psp_14_0_0_toc.bin
 firmware/amdgpu/psp_14_0_1_ta.bin
 firmware/amdgpu/psp_14_0_1_toc.bin
+firmware/amdgpu/psp_14_0_4_ta.bin
+firmware/amdgpu/psp_14_0_4_toc.bin
 firmware/amdgpu/raven2_asd.bin
 firmware/amdgpu/raven2_ce.bin
 firmware/amdgpu/raven2_gpu_info.bin
@@ -438,6 +447,7 @@ firmware/amdgpu/sdma_6_0_2.bin
 firmware/amdgpu/sdma_6_0_3.bin
 firmware/amdgpu/sdma_6_1_0.bin
 firmware/amdgpu/sdma_6_1_1.bin
+firmware/amdgpu/sdma_6_1_2.bin
 firmware/amdgpu/si58_mc.bin
 firmware/amdgpu/sienna_cichlid_ce.bin
 firmware/amdgpu/sienna_cichlid_dmcub.bin
@@ -579,6 +589,7 @@ firmware/amdgpu/verde_smc.bin
 firmware/amdgpu/verde_uvd.bin
 firmware/amdgpu/vpe_6_1_0.bin
 firmware/amdgpu/vpe_6_1_1.bin
+firmware/amdgpu/vpe_6_1_3.bin
 firmware/amdgpu/yellow_carp_asd.bin
 firmware/amdgpu/yellow_carp_ce.bin
 firmware/amdgpu/yellow_carp_dmcub.bin
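
The distinfo entries in the diff are the base64-encoded SHA256 digest and the
byte size of the distfile. In the ports tree these are normally regenerated
with "make makesum"; the sketch below is only for checking a downloaded
tarball against the diff by hand. The function name distinfo_lines and its
arguments are hypothetical.

```python
import base64
import hashlib
import os
import sys


def distinfo_lines(path, label):
    """Return distinfo-style SHA256 and SIZE lines for the file at path.

    label is the name shown in parentheses, for example
    firmware/linux-firmware-20241017.tar.xz.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in 1 MiB chunks so a large tarball is not held in memory.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    b64 = base64.b64encode(digest.digest()).decode("ascii")
    size = os.path.getsize(path)
    return [f"SHA256 ({label}) = {b64}", f"SIZE ({label}) = {size}"]


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: script.py /path/to/tarball label-as-in-distinfo
    print("\n".join(distinfo_lines(sys.argv[1], sys.argv[2])))
```

If the printed lines differ from the "+" lines in the distinfo hunk, the
download is probably incomplete or not the same tarball.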
