[Bug 211807] [drm:drm_dp_mst_dpcd_read] *ERROR* mstb 000000004e6288dd port 3: DPCD read on addr 0x60 for 1 bytes NAKed
https://bugzilla.kernel.org/show_bug.cgi?id=211807 pavol.ha...@gmail.com changed: What|Removed |Added CC||pavol.ha...@gmail.com --- Comment #2 from pavol.ha...@gmail.com --- I am experiencing the aforementioned error as well. [drm:drm_dp_mst_dpcd_read [drm_kms_helper]] *ERROR* mstb f00db6df port 0: DPCD read on addr 0x4b0 for 1 bytes NAKed I have problems connecting my ThinkPad X1 Carbon (7th gen) to two external DP monitors via a ThinkPad USB-C gen2 docking station. Unfortunately, Linux is not officially supported for this dock. The monitors work at first, but they stop working seemingly at random, for example when I open Ubuntu settings or launch a program with a keyboard shortcut. The monitors flicker, then die, and I have to reboot. Sometimes just reconnecting the cable works, but usually not. Sometimes the monitors start working again when I connect my phone to the dock, but most often not. A lot of people are having these problems, but I have not found any solution. Only this error appears in dmesg. To the best of my knowledge I am up to date with everything: kernel, BIOS, firmware, etc. -- You may reply to this email to add a comment. You are receiving this mail because: You are watching the assignee of the bug.
[Bug 213779] Screen stays blank on resume. from hibernate
https://bugzilla.kernel.org/show_bug.cgi?id=213779 alex14...@yahoo.com changed: What|Removed |Added Regression|No |Yes --- Comment #3 from alex14...@yahoo.com --- That patch works.
[Bug 213715] failed to change brightness of HDR panel on AMD GREEN_SARDINE through aux
https://bugzilla.kernel.org/show_bug.cgi?id=213715 mario.limoncie...@amd.com changed: What|Removed |Added CC||mario.limoncie...@amd.com --- Comment #2 from mario.limoncie...@amd.com --- I would suggest working with BOE and CSO to fix the DPCD values in the panel firmware so that they do not report AUX control if it is not functional. You can see that the kernel checks the DPCD to determine whether to use AUX or PWM: https://github.com/torvalds/linux/blob/c010efb7f0bc0c3cb2cd26b000f71d4bd0c427cd/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c#L2433 BTW, this behavior has also been reported in the past, so a heuristic may be introduced at some point in the future for the buggy panel firmware: https://gitlab.freedesktop.org/drm/amd/-/issues/1438
[Bug 211425] [drm:atom_op_jump] *ERROR* atombios stuck in loop for more than 20secs aborting
https://bugzilla.kernel.org/show_bug.cgi?id=211425 Andreas (icedragon...@web.de) changed: What|Removed |Added Kernel Version|5.12.14 |5.12.11 - 5.13.3
[Bug 213053] WARNING on dcn30_hwseq.c dcn30_set_hubp_blank, AMD Radeon 6700XT
https://bugzilla.kernel.org/show_bug.cgi?id=213053 heubor...@gmx.de changed: What|Removed |Added CC||heubor...@gmx.de --- Comment #4 from heubor...@gmx.de --- For me, this issue disappeared after updating the kernel to 5.13. I believe this commit reverts the change that introduced the issue: https://github.com/torvalds/linux/commit/0b7421f0a6a41a8ce60c4dadf6f9e7c62fbd2f1f#diff-80cc88d298a712966f02c4cd7f9eb372b675720a337d0cbe85385ccdfb9c5618
[Bug 211425] [drm:atom_op_jump] *ERROR* atombios stuck in loop for more than 20secs aborting
https://bugzilla.kernel.org/show_bug.cgi?id=211425 Andreas (icedragon...@web.de) changed: What|Removed |Added Kernel Version|5.12.11 - 5.13.3|5.12.11 - 5.13.4
[Bug 211425] [drm:atom_op_jump] *ERROR* atombios stuck in loop for more than 20secs aborting
https://bugzilla.kernel.org/show_bug.cgi?id=211425 Andreas (icedragon...@web.de) changed: What|Removed |Added Kernel Version|5.12.11 - 5.13.4|5.13.4 --- Comment #18 from Andreas (icedragon...@web.de) --- Still broken as of the current 5.13.4 kernel. About once a day the screen does not recover and I have to reboot the system.
[Bug 213823] New: Broken power management for amdgpu
https://bugzilla.kernel.org/show_bug.cgi?id=213823

Bug ID: 213823
Summary: Broken power management for amdgpu
Product: Drivers
Version: 2.5
Kernel Version: 5.13.4
Hardware: All
OS: Linux
Tree: Mainline
Status: NEW
Severity: high
Priority: P1
Component: Video(DRI - non Intel)
Assignee: drivers_video-...@kernel-bugs.osdl.org
Reporter: bruno.n.pag...@gmail.com
Regression: No

Created attachment 298003 --> https://bugzilla.kernel.org/attachment.cgi?id=298003&action=edit amdgpu dmesg output on 5.12

After upgrading to kernel 5.13.4 (from 5.12.15, on Arch Linux), I noticed that my AMD dGPU no longer powers off, resulting in increased power consumption, heat and noise (because of the fan trying to dissipate the heat). I’ve compared the kernel dmesg output on both kernels and found related differences:

@@ -1,4 +1,6 @@
 [drm] amdgpu kernel modesetting enabled.
+amdgpu: CRAT table not found
+amdgpu: Virtual CRAT table created for CPU
 amdgpu: Topology: Add CPU node
 fb0: switching to amdgpudrmfb from EFI VGA
 amdgpu :01:00.0: enabling device (0006 -> 0007)
@@ -14,7 +16,10 @@
 amdgpu :01:00.0: amdgpu: GART: 256M 0x00FF - 0x00FF0FFF
 [drm] amdgpu: 4096M of VRAM memory ready
 [drm] amdgpu: 4096M of GTT memory ready.
 amdgpu: hwmgr_sw_init smu backed is vegam_smu
+kfd kfd: amdgpu: Allocated 3969056 bytes on gart
+amdgpu: Virtual CRAT table created for GPU
 amdgpu: Topology: Add dGPU node [0x694f:0x1002]
+kfd kfd: amdgpu: added device 1002:694f
 amdgpu :01:00.0: amdgpu: SE 4, SH per SE 1, CU per SH 6, active_cu_number 20
-amdgpu :01:00.0: amdgpu: Using ATPX for runtime pm
-[drm] Initialized amdgpu 3.40.0 20150101 for :01:00.0 on minor 1
+amdgpu :01:00.0: amdgpu: Using BOCO for runtime pm
+[drm] Initialized amdgpu 3.41.0 20150101 for :01:00.0 on minor 1

I’ve attached both excerpts matching the diff above. FWIW, this is a Dell Precision 5530 2-in-1 with a Kaby Lake-G CPU, which has an Intel HD 630 iGPU as well as an AMD Polaris 22 MGL XL [Radeon Pro WX Vega M GL] dGPU.

Please tell me if there is anything else that I can provide in order to fix this.
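The comparison described in this report (two dmesg captures, one per kernel) can be reproduced with a few lines of Python. This is only a sketch under assumptions: the function names and the file labels `dmesg-5.12` / `dmesg-5.13` are hypothetical, and the timestamps in the sample input are invented for illustration. It strips the leading dmesg timestamp so that only the message text is compared.

```python
import difflib
import re

# Matches a leading dmesg timestamp such as "[    1.234567] ".
# A tag like "[drm]" does not match because it contains no digits and dot.
TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]\s*")


def normalize(lines):
    """Drop timestamps and blank lines so only the message text is compared."""
    return [TIMESTAMP.sub("", line).strip() for line in lines if line.strip()]


def diff_dmesg(old_lines, new_lines, old_name="dmesg-5.12", new_name="dmesg-5.13"):
    """Return a unified diff (list of strings) between two dmesg captures."""
    return list(
        difflib.unified_diff(
            normalize(old_lines),
            normalize(new_lines),
            fromfile=old_name,
            tofile=new_name,
            lineterm="",
        )
    )


if __name__ == "__main__":
    # Sample input: message text taken from the report, timestamps invented.
    old = [
        "[    1.000000] [drm] amdgpu kernel modesetting enabled.",
        "[    2.500000] amdgpu :01:00.0: amdgpu: Using ATPX for runtime pm",
    ]
    new = [
        "[    1.100000] [drm] amdgpu kernel modesetting enabled.",
        "[    2.700000] amdgpu :01:00.0: amdgpu: Using BOCO for runtime pm",
    ]
    for line in diff_dmesg(old, new):
        print(line)
```

Stripping timestamps first matters because otherwise every line would differ between two boots and the diff would carry no signal.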
[Bug 213823] Broken power management for amdgpu
https://bugzilla.kernel.org/show_bug.cgi?id=213823 Bruno Pagani (bruno.n.pag...@gmail.com) changed: What|Removed |Added CC||bruno.n.pag...@gmail.com --- Comment #1 from Bruno Pagani (bruno.n.pag...@gmail.com) --- Created attachment 298005 --> https://bugzilla.kernel.org/attachment.cgi?id=298005&action=edit amdgpu dmesg output on 5.13
[Bug 213823] Broken power management for amdgpu
https://bugzilla.kernel.org/show_bug.cgi?id=213823 Bruno Pagani (bruno.n.pag...@gmail.com) changed: What|Removed |Added Regression|No |Yes
[Bug 213823] Broken power management for amdgpu
https://bugzilla.kernel.org/show_bug.cgi?id=213823 Alex Deucher (alexdeuc...@gmail.com) changed: What|Removed |Added CC||alexdeuc...@gmail.com --- Comment #2 from Alex Deucher (alexdeuc...@gmail.com) --- Please attach your full dmesg outputs. Can you bisect?
[Bug 213823] Broken power management for amdgpu
https://bugzilla.kernel.org/show_bug.cgi?id=213823 --- Comment #3 from Bruno Pagani (bruno.n.pag...@gmail.com) --- Created attachment 298009 --> https://bugzilla.kernel.org/attachment.cgi?id=298009&action=edit dmesg 5.12
[Bug 213823] Broken power management for amdgpu
https://bugzilla.kernel.org/show_bug.cgi?id=213823 --- Comment #4 from Bruno Pagani (bruno.n.pag...@gmail.com) --- Created attachment 298011 --> https://bugzilla.kernel.org/attachment.cgi?id=298011&action=edit dmesg 5.13
[Bug 213823] Broken power management for amdgpu
https://bugzilla.kernel.org/show_bug.cgi?id=213823 --- Comment #5 from Bruno Pagani (bruno.n.pag...@gmail.com) --- (In reply to Alex Deucher from comment #2) > Please attach your full dmesg outputs. Can you bisect? Done. Unfortunately no: I’ve never bisected before, so while I expect to be technically able to do it, I guess it will take me some time to set up (I have never compiled a kernel myself either), and time is something I definitely lack at the moment (several deadlines to meet each week until the end of August). Since I can live with a 5.12 kernel (or even 5.10 LTS), I’m fine with this waiting until I have time to set up bisecting, if need be.
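For readers who, like the reporter, have not bisected before: bisection is a binary search between a known-good and a known-bad revision, which is the idea behind `git bisect`. The sketch below only illustrates that search in Python; `first_bad` and `is_bad` are hypothetical names, the candidate list stands in for an ordered commit range, and a real bisect requires building and booting a kernel for each probe.

```python
def first_bad(candidates, is_bad):
    """Return the index of the first bad entry in an ordered list.

    Assumes candidates[0] is known good, candidates[-1] is known bad,
    and that once an entry is bad every later entry is bad as well.
    """
    good, bad = 0, len(candidates) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(candidates[mid]):
            bad = mid    # the regression is at mid or earlier
        else:
            good = mid   # the regression is after mid
    return bad


if __name__ == "__main__":
    # Hypothetical range of 1000 revisions where revision 613 introduced the bug.
    revisions = list(range(1000))
    probes = []

    def is_bad(rev):
        probes.append(rev)  # each probe stands for one build-and-boot cycle
        return rev >= 613

    print(first_bad(revisions, is_bad), "found after", len(probes), "probes")
```

The number of probes grows with the logarithm of the range, which is why even a range of thousands of commits needs only around a dozen kernel builds.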
[Bug 205089] amdgpu : drm:amdgpu_cs_ioctl : Failed to initialize parser -125
https://bugzilla.kernel.org/show_bug.cgi?id=205089 jes...@jnsn.dev changed: What|Removed |Added CC||jes...@jnsn.dev --- Comment #14 from jes...@jnsn.dev ---

I'm now seeing this bug again. This time it's happening while launching dota2.

Hardware: RX 5700 XT, Ryzen 3800X
Software: Mesa 21.1.5 (arch mainline), Linux 5.13.4.arch2-1

Log (notice that it's most recent first):

Jul 26 22:15:55 delusionalStation kernel: amdgpu :0a:00.0: amdgpu: GPU reset(2) succeeded!
Jul 26 22:15:55 delusionalStation kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
Jul 26 22:15:55 delusionalStation kernel: [drm] Skip scheduling IBs!
... A bunch of repeats
Jul 26 22:15:55 delusionalStation kernel: amdgpu :0a:00.0: amdgpu: recover vram bo from shadow done
Jul 26 22:15:55 delusionalStation kernel: amdgpu :0a:00.0: amdgpu: recover vram bo from shadow start
Jul 26 22:15:55 delusionalStation kernel: amdgpu :0a:00.0: amdgpu: ring jpeg_dec uses VM inv eng 5 on hub 1
Jul 26 22:15:55 delusionalStation kernel: amdgpu :0a:00.0: amdgpu: ring vcn_enc1 uses VM inv eng 4 on hub 1
Jul 26 22:15:55 delusionalStation kernel: amdgpu :0a:00.0: amdgpu: ring vcn_enc0 uses VM inv eng 1 on hub 1
Jul 26 22:15:55 delusionalStation kernel: amdgpu :0a:00.0: amdgpu: ring vcn_dec uses VM inv eng 0 on hub 1
Jul 26 22:15:55 delusionalStation kernel: amdgpu :0a:00.0: amdgpu: ring sdma1 uses VM inv eng 13 on hub 0
Jul 26 22:15:55 delusionalStation kernel: amdgpu :0a:00.0: amdgpu: ring sdma0 uses VM inv eng 12 on hub 0
Jul 26 22:15:55 delusionalStation kernel: amdgpu :0a:00.0: amdgpu: ring kiq_2.1.0 uses VM inv eng 11 on hub 0
Jul 26 22:15:55 delusionalStation kernel: amdgpu :0a:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 10 on hub 0
Jul 26 22:15:55 delusionalStation kernel: amdgpu :0a:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 9 on hub 0
Jul 26 22:15:55 delusionalStation kernel: amdgpu :0a:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 8 on hub 0
Jul 26 22:15:55 delusionalStation kernel: amdgpu :0a:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 7 on hub 0
Jul 26 22:15:55 delusionalStation kernel: amdgpu :0a:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 6 on hub 0
Jul 26 22:15:55 delusionalStation kernel: amdgpu :0a:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 5 on hub 0
Jul 26 22:15:55 delusionalStation kernel: amdgpu :0a:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
Jul 26 22:15:55 delusionalStation kernel: amdgpu :0a:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
Jul 26 22:15:55 delusionalStation kernel: amdgpu :0a:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
Jul 26 22:15:55 delusionalStation kernel: [drm] JPEG decode initialized successfully.
Jul 26 22:15:55 delusionalStation kernel: [drm] VCN decode and encode initialized successfully(under DPG Mode).
Jul 26 22:15:55 delusionalStation kernel: [drm] kiq ring mec 2 pipe 1 q 0
Jul 26 22:15:55 delusionalStation kernel: amdgpu :0a:00.0: amdgpu: SMU is resumed successfully!
Jul 26 22:15:55 delusionalStation kernel: amdgpu :0a:00.0: amdgpu: SMU is resuming...
Jul 26 22:15:55 delusionalStation kernel: amdgpu :0a:00.0: amdgpu: SECUREDISPLAY: securedisplay ta ucode is not available
Jul 26 22:15:55 delusionalStation kernel: amdgpu :0a:00.0: amdgpu: RAP: optional rap ta ucode is not available
Jul 26 22:15:55 delusionalStation kernel: amdgpu :0a:00.0: amdgpu: RAS: optional ras ta ucode is not available
Jul 26 22:15:55 delusionalStation kernel: [drm] reserve 0x90 from 0x81fe40 for PSP TMR
Jul 26 22:15:55 delusionalStation kernel: [drm] PSP is resuming...
Jul 26 22:15:55 delusionalStation kernel: [drm] VRAM is lost due to GPU reset!
Jul 26 22:15:55 delusionalStation kernel: [drm] PCIE GART of 512M enabled (table at 0x00800030).
Jul 26 22:15:55 delusionalStation kernel: amdgpu :0a:00.0: amdgpu: GPU reset succeeded, trying to resume
Jul 26 22:15:51 delusionalStation kernel: amdgpu :0a:00.0: amdgpu: BACO reset
Jul 26 22:15:51 delusionalStation kernel: [drm] free PSP TMR buffer
Jul 26 22:15:51 delusionalStation kernel: [drm:gfx_v10_0_hw_fini [amdgpu]] *ERROR* failed to halt cp gfx
Jul 26 22:15:51 delusionalStation kernel: [drm:gfx_v10_0_hw_fini [amdgpu]] *ERROR* KCQ disable failed
Jul 26 22:15:51 delusionalStation kernel: amdgpu :0a:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110)
Jul 26 22:15:51 delusionalStation kernel: [drm:gfx_v10_0_hw_fini [amdgpu]] *ERROR* KGQ disable failed
Jul 26 22:15:51 delusionalStation kernel: amdgpu :0a:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110)
Jul 26 22:15:51 delusionalStation kernel: amdgpu :0a:00.0: amdgpu: GPU reset be
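Because the log in this comment is listed most recent first, the order of events is easier to follow after reversing it. A minimal sketch, assuming the log is available as a list of lines; `chronological_errors` is a hypothetical helper name, and the filter on `*ERROR*` simply mirrors the marker the kernel messages above use.

```python
def chronological_errors(newest_first_lines, marker="*ERROR*"):
    """Reverse a newest-first log and keep only lines containing the marker."""
    return [line for line in reversed(newest_first_lines) if marker in line]


if __name__ == "__main__":
    # Three lines copied from the comment above, in the order they were posted.
    log = [
        "Jul 26 22:15:55 delusionalStation kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!",
        "Jul 26 22:15:51 delusionalStation kernel: amdgpu :0a:00.0: amdgpu: BACO reset",
        "Jul 26 22:15:51 delusionalStation kernel: [drm:gfx_v10_0_hw_fini [amdgpu]] *ERROR* failed to halt cp gfx",
    ]
    for line in chronological_errors(log):
        print(line)
```

Read in this order, the failures while tearing down the graphics block at 22:15:51 come first and the parser error at 22:15:55 follows the reset.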
[Bug 213779] Screen stays blank on resume from hibernate
https://bugzilla.kernel.org/show_bug.cgi?id=213779 alex14...@yahoo.com changed: What|Removed |Added Summary|Screen stays blank on resume. from hibernate |Screen stays blank on resume from hibernate
[Bug 213779] Screen stays blank on resume from hibernate
https://bugzilla.kernel.org/show_bug.cgi?id=213779 --- Comment #4 from Alex Deucher (alexdeuc...@gmail.com) --- Fix is in 5.14 and should land in stable shortly: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=6be50f5d83adc9541de3d5be26e968182b5ac150
[Bug 205089] amdgpu : drm:amdgpu_cs_ioctl : Failed to initialize parser -125
https://bugzilla.kernel.org/show_bug.cgi?id=205089 Alois Nespor (i...@aloisnespor.info) changed: What|Removed |Added CC||i...@aloisnespor.info --- Comment #15 from Alois Nespor (i...@aloisnespor.info) --- I can confirm; I now have the same problem with a Ryzen 5 3400G (RX Vega 11), kernel 5.13.4 and Mesa 21.1.5.
[Bug 213373] [drm] [radeon] memory leak at parsing radeon_atombios_parse_power_table
https://bugzilla.kernel.org/show_bug.cgi?id=213373 Erhard F. (erhar...@mailbox.org) changed: What|Removed |Added Attachment #297249 is obsolete|0 |1 --- Comment #4 from Erhard F. (erhar...@mailbox.org) ---

Created attachment 298097 --> https://bugzilla.kernel.org/attachment.cgi?id=298097&action=edit kernel dmesg (5.14-rc3, eMachines E620)

output of /sys/kernel/debug/kmemleak on kernel v5.14-rc3:

unreferenced object 0x8881098e35a8 (size 96):
  comm "systemd-udevd", pid 128, jiffies 4294889391 (age 6947.280s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 7d 24 01 81 88 ff ff  .}$.
    01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  backtrace:
    [] __kmalloc+0x16d/0x1df
    [] radeon_atombios_parse_power_table_1_3+0x446/0x1b26 [radeon]
    [] radeon_atombios_get_power_modes+0x149/0x188d [radeon]
    [] radeon_pm_init+0x1002/0x18d1 [radeon]
    [] rs690_init+0x763/0x83f [radeon]
    [] radeon_device_init+0x1c1a/0x21c1 [radeon]
    [] radeon_driver_load_kms+0x1ef/0x408 [radeon]
    [] drm_dev_register+0x255/0x4a0 [drm]
    [] radeon_pci_probe+0x132/0x15e [radeon]
    [] pci_device_probe+0x1aa/0x294
    [] really_probe+0x28f/0x76b
    [] __driver_probe_device+0x19f/0x1ee
    [] driver_probe_device+0x44/0xbb
    [] __driver_attach+0x1a2/0x1d4
    [] bus_for_each_dev+0xfa/0x146
    [] bus_add_driver+0x2b3/0x447
unreferenced object 0x888101247d00 (size 64):
  comm "systemd-udevd", pid 128, jiffies 4294889391 (age 6947.280s)
  hex dump (first 32 bytes):
    00 00 00 00 40 9c 00 00 00 00 00 00 00 00 00 00  @...
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  backtrace:
    [] kmem_cache_alloc_trace+0x119/0x169
    [] radeon_atombios_parse_power_table_1_3+0x54d/0x1b26 [radeon]
    [] radeon_atombios_get_power_modes+0x149/0x188d [radeon]
    [] radeon_pm_init+0x1002/0x18d1 [radeon]
    [] rs690_init+0x763/0x83f [radeon]
    [] radeon_device_init+0x1c1a/0x21c1 [radeon]
    [] radeon_driver_load_kms+0x1ef/0x408 [radeon]
    [] drm_dev_register+0x255/0x4a0 [drm]
    [] radeon_pci_probe+0x132/0x15e [radeon]
    [] pci_device_probe+0x1aa/0x294
    [] really_probe+0x28f/0x76b
    [] __driver_probe_device+0x19f/0x1ee
    [] driver_probe_device+0x44/0xbb
    [] __driver_attach+0x1a2/0x1d4
    [] bus_for_each_dev+0xfa/0x146
    [] bus_add_driver+0x2b3/0x447
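When a kmemleak report grows long, a quick tally of how many objects were reported and how many bytes they account for helps decide whether a leak is worth chasing. The following is a minimal sketch, assuming the report text has the `unreferenced object 0x... (size N):` headers shown above; `summarize_kmemleak` is a hypothetical helper, not part of any kernel tooling.

```python
import re

# Header line of each kmemleak record, e.g.
# "unreferenced object 0x8881098e35a8 (size 96):"
OBJECT_HEADER = re.compile(r"unreferenced object 0x([0-9a-fA-F]+) \(size (\d+)\):")


def summarize_kmemleak(report_text):
    """Return (number of leaked objects, total leaked bytes) for a report."""
    sizes = [int(size) for _addr, size in OBJECT_HEADER.findall(report_text)]
    return len(sizes), sum(sizes)


if __name__ == "__main__":
    # Headers copied from the report above; the bodies are elided here.
    report = (
        'unreferenced object 0x8881098e35a8 (size 96):\n'
        '  comm "systemd-udevd", pid 128, jiffies 4294889391 (age 6947.280s)\n'
        'unreferenced object 0x888101247d00 (size 64):\n'
        '  comm "systemd-udevd", pid 128, jiffies 4294889391 (age 6947.280s)\n'
    )
    count, total = summarize_kmemleak(report)
    print(f"{count} objects, {total} bytes")
```

For the two objects in this comment the tally is 2 objects and 160 bytes, both allocated from `radeon_atombios_parse_power_table_1_3`.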
[Bug 210263] brightness device returns ENXIO (?) on brightness restore at boot, with bootoption "quiet"
https://bugzilla.kernel.org/show_bug.cgi?id=210263 Jonas Platte (jplatte+li...@posteo.de) changed: What|Removed |Added CC||jplatte+li...@posteo.de --- Comment #4 from Jonas Platte (jplatte+li...@posteo.de) --- Workaround works for me too, on a Lenovo Ideapad 530S-14ARR with an AMD Ryzen 2500U. I also set acpi_backlight=vendor to fix the same error message for backlight:acpi_video1. (No idea why it tries to restore backlight brightness twice)
[Bug 213917] New: Screen starts flickering when laptop(amdgpu) wakes up after suspend.
https://bugzilla.kernel.org/show_bug.cgi?id=213917

Bug ID: 213917
Summary: Screen starts flickering when laptop(amdgpu) wakes up after suspend.
Product: Drivers
Version: 2.5
Kernel Version: 5.13.6
Hardware: x86-64
OS: Linux
Tree: Mainline
Status: NEW
Severity: normal
Priority: P1
Component: Video(DRI - non Intel)
Assignee: drivers_video-...@kernel-bugs.osdl.org
Reporter: utkarsh.nav...@gmail.com
Regression: No

I have an ASUS FX505DT with an AMD Ryzen 3550H APU. This is not a new bug and it was fixed in kernel v5.6 with commit hash: eb916a5a93a64c182b0a8f43886aa6bb4c3e52b0 I haven't had the time to test out each of the kernel versions individually, but this bug isn't there in Linux v5.12.6 and only appeared after I updated to v5.13.5.

## Steps to reproduce
1. Let the laptop suspend.
2. Wake it up with some keypresses or something.
3. After waking up, the screen starts flickering on/off every second.
[Bug 211425] [drm:atom_op_jump] *ERROR* atombios stuck in loop for more than 20secs aborting
https://bugzilla.kernel.org/show_bug.cgi?id=211425 Andreas (icedragon...@web.de) changed: What|Removed |Added Kernel Version|5.13.4 |5.13.6
[Bug 213935] New: AMDGPU Renoir crash/freeze while using vaapi with some video types in some apps - drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for fences timed out!
https://bugzilla.kernel.org/show_bug.cgi?id=213935

Bug ID: 213935
Summary: AMDGPU Renoir crash/freeze while using vaapi with some video types in some apps - drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for fences timed out!
Product: Drivers
Version: 2.5
Kernel Version: 5.13.6
Hardware: x86-64
OS: Linux
Tree: Mainline
Status: NEW
Severity: normal
Priority: P1
Component: Video(DRI - non Intel)
Assignee: drivers_video-...@kernel-bugs.osdl.org
Reporter: plusf...@gmail.com
Regression: No

Created attachment 298139 --> https://bugzilla.kernel.org/attachment.cgi?id=298139&action=edit dmesg

Jul 31 09:50:49 helium kernel: [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for fences timed out!
Jul 31 09:50:52 helium kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, signaled seq=18739, emitted seq=18742
Jul 31 09:50:52 helium kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process com.github.rafo pid 4266 thread gjs:cs0 pid 4320
Jul 31 09:50:52 helium kernel: amdgpu :04:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110)
Jul 31 09:50:53 helium kernel: amdgpu :04:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110)
Jul 31 09:50:53 helium kernel: [drm:amdgpu_gfx_enable_kcq.cold [amdgpu]] *ERROR* KCQ enable failed
Jul 31 09:50:53 helium kernel: [drm:amdgpu_device_ip_resume_phase2 [amdgpu]] *ERROR* resume of IP block failed -110

In certain situations I'm able to crash/freeze my system by playing mp4 videos with VA-API acceleration enabled. When the crash/freeze happens, the screen goes black and the system does not respond to any input. Sadly, whether anything gets written to the log in this case is random. In most cases nothing about the crash is written to the log.

My environment is GNOME in Wayland mode with native Wayland apps (in this case Firefox and Clapper, https://rafostar.github.io/clapper/). The APU is a Ryzen 5 4500U.

For Firefox I'm not able to reliably reproduce the bug. It happens at random while scrolling on Twitter and Reddit. It never happens on Netflix or YouTube, for example. Luckily I was able to reproduce it with an app called Clapper and a video provided by someone on Reddit: https://cdn.discordapp.com/attachments/399812928854949890/870910339548590180/VID_20210731_124021.mp4

Steps:
1. Have GNOME running in Wayland mode and vaapi installed (check with `vainfo`)
2. Install Clapper
3. Download the video
4. Run the video in Clapper
5. While it is running, launch the video again in Clapper

It should *not* create another instance of Clapper, but try to re-launch the video in the instance of Clapper that is already running. You may hear a few seconds of audio, but the whole session is frozen and a few seconds later it turns into an all-black screen with no possible recovery.

I'm able to reproduce this with every kernel I tested, down to 5.8.
[Bug 205089] amdgpu : drm:amdgpu_cs_ioctl : Failed to initialize parser -125
https://bugzilla.kernel.org/show_bug.cgi?id=205089 mcmar...@gmx.net changed: What|Removed |Added CC||mcmar...@gmx.net --- Comment #16 from mcmar...@gmx.net --- I have the same problem with kernel 5.11.22-2-MANJARO.
[Bug 205089] amdgpu : drm:amdgpu_cs_ioctl : Failed to initialize parser -125
https://bugzilla.kernel.org/show_bug.cgi?id=205089 --- Comment #17 from Alex Deucher (alexdeuc...@gmail.com) --- Does up/downgrading the mesa driver help?
[Bug 213935] AMDGPU Renoir crash/freeze while using vaapi with some video types in some apps - drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for fences timed out!
https://bugzilla.kernel.org/show_bug.cgi?id=213935 Alex Deucher (alexdeuc...@gmail.com) changed: What|Removed |Added CC||alexdeuc...@gmail.com --- Comment #1 from Alex Deucher (alexdeuc...@gmail.com) --- Can you try a newer or older version of mesa? Most likely this is a bug in the user mode driver. The kernel is just the messenger.
[Bug 205089] amdgpu : drm:amdgpu_cs_ioctl : Failed to initialize parser -125
https://bugzilla.kernel.org/show_bug.cgi?id=205089 --- Comment #18 from jes...@jnsn.dev --- On 02/08/21 at 02:13pm, bugzilla-dae...@bugzilla.kernel.org wrote: >Does up/downgrading the mesa driver help? Upgrading to the latest git revision of mesa has fixed Dota 2 for me at least.
[Bug 205169] AMDGPU driver with Navi card hangs Xorg in fullscreen only.
https://bugzilla.kernel.org/show_bug.cgi?id=205169 --- Comment #27 from aladjev.and...@gmail.com (aladjev.and...@gmail.com) --- The kernel driver hangs in production under regular usage. Such issues should be escalated as much as possible: meetings of DCN authors and developers, replacement of core developers, driver refactoring or rewrite, test coverage. But that only works in a commercial environment; open source provides TIMEOUT_FOR_FLIP_PENDING. 1.5 years have passed: TIMEOUT_FOR_FLIP_PENDING is still here and nobody cares, and I am almost sure that nobody will care about it tomorrow. Thank you.
[Bug 212077] AMD GPU discrete card memory at highest frequency even while not in use
https://bugzilla.kernel.org/show_bug.cgi?id=212077 Bat Malin (bat_ma...@abv.bg) changed: What|Removed |Added Status|REOPENED|RESOLVED Resolution|--- |CODE_FIX --- Comment #15 from Bat Malin (bat_ma...@abv.bg) ---

The issue is fixed in 5.11.12, and it now even consumes less power (~1.07 W less).

Before:
amdgpu-pci-0100
Adapter: PCI adapter
vddgfx: 756.00 mV
edge: +35.0°C (crit = +94.0°C, hyst = -273.1°C)
power1: 8.14 W (cap = 60.00 W)

After:
amdgpu-pci-0100
Adapter: PCI adapter
vddgfx: 756.00 mV
edge: +38.0°C (crit = +94.0°C, hyst = -273.1°C)
power1: 7.07 W (cap = 60.00 W)

Thank you!
___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
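The "~1.07 W less" figure follows directly from the two `power1` readings (8.14 W before, 7.07 W after). A small sketch of that arithmetic, assuming the readings come as text in the format quoted in this comment; `power_watts` and `power_saving` are hypothetical helper names.

```python
import re

# Matches the power line of the sensors-style output, e.g. "power1: 8.14 W".
POWER = re.compile(r"power1:\s*([0-9]+(?:\.[0-9]+)?)\s*W")


def power_watts(sensors_text):
    """Extract the power1 reading in watts from one block of output."""
    match = POWER.search(sensors_text)
    if match is None:
        raise ValueError("no power1 reading found")
    return float(match.group(1))


def power_saving(before_text, after_text):
    """Return how many watts less the 'after' reading is, rounded to 0.01 W."""
    return round(power_watts(before_text) - power_watts(after_text), 2)


if __name__ == "__main__":
    before = "vddgfx: 756.00 mV\npower1: 8.14 W (cap = 60.00 W)"
    after = "vddgfx: 756.00 mV\npower1: 7.07 W (cap = 60.00 W)"
    print(power_saving(before, after), "W saved")
```

Single readings fluctuate with load and temperature, so averaging several samples per kernel would give a more reliable comparison than one reading each.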
[Bug 211875] CPU frequency scaling lost after "WARNING: CPU: 2 PID: 2358578 at smu8_send_msg_to_smc_with_parameter+0xfe/0x140 [amdgpu]"
https://bugzilla.kernel.org/show_bug.cgi?id=211875 --- Comment #8 from Erhard F. (erhar...@mailbox.org) ---

Created attachment 296273 --> https://bugzilla.kernel.org/attachment.cgi?id=296273&action=edit dmesg (kernel 4.14.228, A10-9700E)

Traced the issue back to kernel v4.14.228 which is still affected (v4.19.184 and v5.4.109 too). On v4.14.228 no stack trace like in recent kernels but these messages:

[...]
[28541.868617] amdgpu: [powerplay] min_core_set_clock not set
[28542.483228] cz_send_msg_to_smc_async (0x0011) failed
[28543.097905] cz_send_msg_to_smc_async (0x026e) failed
[28543.712424] cz_send_msg_to_smc_async (0x002f) failed
[28543.712719] amdgpu: [powerplay] min_core_set_clock not set
[28544.330105] cz_send_msg_to_smc_async (0x0011) failed
[28544.947054] cz_send_msg_to_smc_async (0x026e) failed
[28545.564013] cz_send_msg_to_smc_async (0x002f) failed
[28545.564251] amdgpu: [powerplay] min_core_set_clock not set
[28546.179695] cz_send_msg_to_smc_async (0x0011) failed
[28546.794880] cz_send_msg_to_smc_async (0x026e) failed
[28547.409986] cz_send_msg_to_smc_async (0x002f) failed

Apart from that the machine behaves the same after these "cz_send_msg_to_smc_async (0x002f) failed" - CPU permanently downclocked to 800 MHz, desktop 'freezing' issues with display going black after some time. Access via ssh still works.
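The excerpt in this comment repeats the same three failing SMC messages in a loop. To confirm that pattern over a full dmesg, the failures can be tallied per message ID. This is a minimal sketch that assumes the `cz_send_msg_to_smc_async (0x....) failed` wording quoted above; `count_failed_messages` is a hypothetical helper name.

```python
import re
from collections import Counter

# Captures the message ID from lines such as
# "cz_send_msg_to_smc_async (0x0011) failed".
FAILED = re.compile(r"cz_send_msg_to_smc_async \((0x[0-9a-fA-F]+)\) failed")


def count_failed_messages(log_text):
    """Return a Counter mapping each failing SMC message ID to its count."""
    return Counter(FAILED.findall(log_text))


if __name__ == "__main__":
    # Four lines copied from the excerpt above.
    excerpt = """\
[28542.483228] cz_send_msg_to_smc_async (0x0011) failed
[28543.097905] cz_send_msg_to_smc_async (0x026e) failed
[28543.712424] cz_send_msg_to_smc_async (0x002f) failed
[28544.330105] cz_send_msg_to_smc_async (0x0011) failed
"""
    for msg_id, count in count_failed_messages(excerpt).most_common():
        print(msg_id, count)
```

An even spread across the same few IDs, as in this report, suggests one periodic operation failing repeatedly rather than unrelated errors.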
[Bug 211425] [drm:atom_op_jump] *ERROR* atombios stuck in loop for more than 20secs aborting
https://bugzilla.kernel.org/show_bug.cgi?id=211425 Andreas (icedragon...@web.de) changed: What|Removed |Added Kernel Version|5.11.11 |5.11.12 --- Comment #16 from Andreas (icedragon...@web.de) ---

With the 5.11.12 kernel (still affected), there is a new message line at the end of the other error messages:

...
[Do Apr 8 11:13:05 2021] [drm:dcn10_link_encoder_enable_dp_output] *ERROR* dcn10_link_encoder_enable_dp_output: Failed to execute VBIOS command table!
[Do Apr 8 11:13:07 2021] [drm] amdgpu_dm_irq_schedule_work FAILED src 2
[Do Apr 8 11:13:27 2021] [drm:atom_op_jump] *ERROR* atombios stuck in loop for more than 20secs aborting
[Do Apr 8 11:13:27 2021] [drm:amdgpu_atom_execute_table_locked] *ERROR* atombios stuck executing B228 (len 3608, WS 8, PS 0) @ 0xB712
[Do Apr 8 11:13:27 2021] [drm:amdgpu_atom_execute_table_locked] *ERROR* atombios stuck executing B11C (len 268, WS 4, PS 0) @ 0xB16F
[Do Apr 8 11:13:27 2021] [drm:dcn10_link_encoder_enable_dp_output] *ERROR* dcn10_link_encoder_enable_dp_output: Failed to execute VBIOS command table!
[Do Apr 8 11:13:29 2021] [drm:dc_link_detect_helper] *ERROR* No EDID read.
[Bug 212077] AMD GPU discrete card memory at highest frequency even while not in use
https://bugzilla.kernel.org/show_bug.cgi?id=212077 --- Comment #16 from Bat Malin (bat_ma...@abv.bg) ---

After reboot even better -

amdgpu-pci-0100
Adapter: PCI adapter
vddgfx: 756.00 mV
edge: +35.0°C (crit = +94.0°C, hyst = -273.1°C)
power1: 6.22 W (cap = 60.00 W)
[Bug 211875] CPU frequency scaling lost after "WARNING: CPU: 2 PID: 2358578 at smu8_send_msg_to_smc_with_parameter+0xfe/0x140 [amdgpu]"
https://bugzilla.kernel.org/show_bug.cgi?id=211875 --- Comment #9 from Erhard F. (erhar...@mailbox.org) --- (In reply to Alex Deucher from comment #4) > If this is a regression, can you bisect? > https://www.kernel.org/doc/html/latest/admin-guide/bug-bisect.html Sorry, but my bisecting efforts came to a halt. The last kernel I was able to boot was 4.14.228, and it still has the issue. I was able to build kernels 4.13.16, 4.12.14, 4.11.12, 4.10.17 and 4.9.264, but they don't boot into a desktop or even a console. I just get a "no signal" message on my monitor after the kernel has booted, and some of these kernels reboot. I don't know how to proceed from here.
[Bug 212635] New: nouveau 0000:04:00.0: fifo: fault 00 [READ] at 0000000000380000 engine 00 [GR] client 14 [HUB/SCC] reason 02 [PTE] on channel 5 [007fabf000 X[570]]
https://bugzilla.kernel.org/show_bug.cgi?id=212635 Bug ID: 212635 Summary: nouveau :04:00.0: fifo: fault 00 [READ] at 0038 engine 00 [GR] client 14 [HUB/SCC] reason 02 [PTE] on channel 5 [007fabf000 X[570]] Product: Drivers Version: 2.5 Kernel Version: 5.11.12 Hardware: x86-64 OS: Linux Tree: Mainline Status: NEW Severity: normal Priority: P1 Component: Video(DRI - non Intel) Assignee: drivers_video-...@kernel-bugs.osdl.org Reporter: erhar...@mailbox.org Regression: No Created attachment 296315 --> https://bugzilla.kernel.org/attachment.cgi?id=296315&action=edit dmesg (kernel 5.11.12) Happened during browsing on Firefox, WebGL conformance tests were running in the background (https://www.khronos.org/registry/webgl/sdk/tests/webgl-conformance-tests.html?version=2.0.1). The screen went blank, but I got to desktop back again and the machine stayed usable. [...] nouveau :04:00.0: fifo: fault 00 [READ] at 0038 engine 00 [GR] client 14 [HUB/SCC] reason 02 [PTE] on channel 5 [007fabf000 X[570]] nouveau :04:00.0: fifo: channel 5: killed nouveau :04:00.0: fifo: runlist 0: scheduled for recovery nouveau :04:00.0: fifo: engine 0: scheduled for recovery nouveau :04:00.0: X[570]: channel 5 killed! nouveau :04:00.0: fifo: fault 00 [READ] at 0038 engine 00 [GR] client 14 [HUB/SCC] reason 02 [PTE] on channel 5 [007fabf000 X[570]] nouveau :04:00.0: fifo: channel 5: killed nouveau :04:00.0: fifo: runlist 0: scheduled for recovery nouveau :04:00.0: fifo: engine 0: scheduled for recovery nouveau :04:00.0: X[570]: channel 5 killed! nouveau :04:00.0: fifo: fault 00 [READ] at 0038 engine 00 [GR] client 14 [HUB/SCC] reason 02 [PTE] on channel 5 [007fabf000 X[570]] nouveau :04:00.0: fifo: channel 5: killed nouveau :04:00.0: fifo: runlist 0: scheduled for recovery nouveau :04:00.0: fifo: engine 0: scheduled for recovery nouveau :04:00.0: X[570]: channel 5 killed! 
nouveau :04:00.0: fifo: fault 00 [READ] at 0038 engine 00 [GR] client 14 [HUB/SCC] reason 02 [PTE] on channel 5 [007fabf000 X[570]] nouveau :04:00.0: fifo: channel 5: killed nouveau :04:00.0: fifo: runlist 0: scheduled for recovery nouveau :04:00.0: fifo: engine 0: scheduled for recovery nouveau :04:00.0: X[570]: channel 5 killed! # inxi -b System:Kernel: 5.11.12-gentoo-Excavator x86_64 bits: 64 Desktop: MATE 1.24.0 Distro: Gentoo Base System release 2.7 Machine: Type: Desktop Mobo: ASRock model: A320M-HDV R3.0 serial: M80-BA024200938 UEFI: American Megatrends v: P3.10 date: 06/26/2019 CPU: Info: Quad Core AMD A10-9700E RADEON R7 10 COMPUTE CORES 4C+6G [MCP] speed: 3300 MHz min/max: 800/3000 MHz Graphics: Device-1: NVIDIA GK208B [GeForce GT 710] driver: nouveau v: kernel Display: x11 server: X.Org 1.20.10 driver: modesetting resolution: 1920x1080~60Hz OpenGL: renderer: NV106 v: 4.3 Mesa 20.3.5 Network: Device-1: Realtek RTL8111/8168/8411 PCI Express Gigabit Ethernet driver: r8169 # lspci 00:00.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h (Models 60h-6fh) Processor Root Complex 00:00.2 IOMMU: Advanced Micro Devices, Inc. [AMD] Family 15h (Models 60h-6fh) I/O Memory Management Unit 00:02.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h (Models 60h-6fh) Host Bridge 00:02.4 PCI bridge: Advanced Micro Devices, Inc. [AMD] Family 15h (Models 60h-6fh) Processor Root Port 00:02.5 PCI bridge: Advanced Micro Devices, Inc. [AMD] Family 15h (Models 60h-6fh) Processor Root Port 00:03.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h (Models 60h-6fh) Host Bridge 00:03.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Family 15h (Models 60h-6fh) Processor Root Port 00:08.0 Encryption controller: Advanced Micro Devices, Inc. [AMD] Carrizo Platform Security Processor 00:09.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Carrizo Audio Dummy Host Bridge 00:09.2 Audio device: Advanced Micro Devices, Inc. 
[AMD] Family 15h (Models 60h-6fh) Audio Controller 00:10.0 USB controller: Advanced Micro Devices, Inc. [AMD] FCH USB XHCI Controller (rev 20) 00:11.0 SATA controller: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] (rev 49) 00:12.0 USB controller: Advanced Micro Devices, Inc. [AMD] FCH USB EHCI Controller (rev 49) 00:14.0 SMBus: Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller (rev 4a) 00:14.3 ISA bridge: Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge (rev 11) 00:18.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h (Models 60h-6fh) Processor Function 0 00:18.1 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h (Models 60h-6fh) Processor Function 1 00:18.2 Host bridge: Advanced Micro Devices, I
[Bug 212635] nouveau 0000:04:00.0: fifo: fault 00 [READ] at 0000000000380000 engine 00 [GR] client 14 [HUB/SCC] reason 02 [PTE] on channel 5 [007fabf000 X[570]]
https://bugzilla.kernel.org/show_bug.cgi?id=212635 --- Comment #1 from Erhard F. (erhar...@mailbox.org) --- Created attachment 296317 --> https://bugzilla.kernel.org/attachment.cgi?id=296317&action=edit kernel .config (kernel 5.11.12)
[Bug 212649] general protection fault, probably for non-canonical address 0x1856385d1408f284: 0000 [#1] SMP NOPTI, RIP: 0010:kmem_cache_alloc_trace+0xe9/0x2f0
https://bugzilla.kernel.org/show_bug.cgi?id=212649 Erhard F. (erhar...@mailbox.org) changed: What|Removed |Added CC||airl...@linux.ie, dri-devel@lists.freedesktop.org
[Bug 212655] New: AMDGPU crashes when resuming from suspend when amd_iommu=on
https://bugzilla.kernel.org/show_bug.cgi?id=212655 Bug ID: 212655 Summary: AMDGPU crashes when resuming from suspend when amd_iommu=on Product: Drivers Version: 2.5 Kernel Version: 5.11.10-1 Hardware: x86-64 OS: Linux Tree: Mainline Status: NEW Severity: high Priority: P1 Component: Video(DRI - non Intel) Assignee: drivers_video-...@kernel-bugs.osdl.org Reporter: fjfcavalca...@gmail.com Regression: No My setup is the following: Manjaro Linux on kernel 5.11.10; I also tested Pop!_OS and the problem happens there too. Motherboard: MSI Tomahawk B450. CPU: Ryzen 5 3600. GPU: Radeon RX5700 (Powercolor Red Devil). I tried multiple kernels from 5.9 to 5.12 and all had the same issue: if I turn on the IOMMU, AMDGPU crashes during resume and I have to hard-reset the system (I can't reset it using shutdown -r, for example). What I see in dmesg after resume is the following: [ 36.492418] amdgpu :28:00.0: amdgpu: failed send message: RunBtc (58)param: 0x response 0xffc2 [ 36.492420] amdgpu :28:00.0: amdgpu: RunBtc failed! [ 36.492421] amdgpu :28:00.0: amdgpu: Failed to setup smc hw! [ 36.492422] [drm:amdgpu_device_ip_resume_phase2 [amdgpu]] *ERROR* resume of IP block failed -62 [ 36.492515] amdgpu :28:00.0: amdgpu: amdgpu_device_ip_resume failed (-62). [ 36.492516] PM: dpm_run_callback(): pci_pm_resume+0x0/0xe0 returns -62 [ 36.492520] PM: Device :28:00.0 failed to resume async: error -62
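The negative numbers in the resume log above are kernel error codes (negated errno values). As a reading aid, here is a small Python sketch that pulls such codes out of a log line and names them. The helper `decode_errno` and the table `LINUX_ERRNO` are hypothetical names for this illustration, and the table is deliberately limited to three values from the generic Linux errno headers as used on x86-64; other architectures may number some codes differently.

```python
import re

# Small subset of Linux errno values (generic headers, as used on x86-64).
# This table is an assumption of the sketch, not something a driver exports.
LINUX_ERRNO = {
    12: ("ENOMEM", "Out of memory"),
    62: ("ETIME", "Timer expired"),
    110: ("ETIMEDOUT", "Connection timed out"),
}

# A minus sign followed by digits, not glued to a preceding word or version
# string (so "5.11.14-200" is ignored).
_NEG_CODE = re.compile(r"(?<![\w.])-(\d+)\b")


def decode_errno(line):
    """Return (code, name, description) for each known negative errno in line."""
    found = []
    for match in _NEG_CODE.finditer(line):
        code = int(match.group(1))
        if code in LINUX_ERRNO:
            name, desc = LINUX_ERRNO[code]
            found.append((code, name, desc))
    return found


print(decode_errno("*ERROR* resume of IP block failed -62"))
```

Read this way, the -62 in the log is ETIME: the driver gave up waiting during resume rather than receiving an explicit failure, which fits the preceding "failed send message: RunBtc" line.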
[Bug 209345] [nouveau] unknown chipset (0f22d0a1) (nVidia Tesla K80)
https://bugzilla.kernel.org/show_bug.cgi?id=209345 --- Comment #12 from Alexander von Gluck (kallis...@unixzen.com) --- A new motherboard later.. and after enabling 64-bit PCIe stuff the card posts. ArchLinux 5.11.13 [4.689213] nouveau :0d:00.0: enabling device ( -> 0002) [4.689343] nouveau :0d:00.0: unknown chipset (0f22d0a1) [4.690686] nouveau :0e:00.0: enabling device ( -> 0002) [4.690758] nouveau :0e:00.0: unknown chipset (0f22d0a1) 0d:00.0 3D controller: NVIDIA Corporation GK210GL [Tesla K80] (rev a1) Subsystem: NVIDIA Corporation Device 106c Control: I/O- Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx- Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR- Capabilities: [420 v2] Advanced Error Reporting UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol- UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol- UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol- CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr- CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+ AERCap: First Error Pointer: 00, ECRCGenCap- ECRCGenEn- ECRCChkCap- ECRCChkEn- MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap- HeaderLog: Capabilities: [600 v1] Vendor Specific Information: ID=0001 Rev=1 Len=024 Capabilities: [900 v1] Secondary PCI Express LnkCtl3: LnkEquIntrruptEn- PerformEqu- LaneErrStat: 0 Kernel modules: nouveau 0e:00.0 3D controller: NVIDIA Corporation GK210GL [Tesla K80] (rev a1) Subsystem: NVIDIA Corporation Device 106c Control: I/O- Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx- Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR- Capabilities: [420 v2] Advanced Error Reporting UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol- UEMsk: DLP- SDES- TLP- FCP- 
CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol- UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol- CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr- CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+ AERCap: First Error Pointer: 00, ECRCGenCap- ECRCGenEn- ECRCChkCap- ECRCChkEn- MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap- HeaderLog: Capabilities: [600 v1] Vendor Specific Information: ID=0001 Rev=1 Len=024 Capabilities: [900 v1] Secondary PCI Express LnkCtl3: LnkEquIntrruptEn- PerformEqu- LaneErrStat: 0 Kernel modules: nouveau
[Bug 209345] [nouveau] unknown chipset (0f22d0a1) (nVidia Tesla K80)
https://bugzilla.kernel.org/show_bug.cgi?id=209345 --- Comment #13 from Ilia Mirkin (imir...@alum.mit.edu) --- See comment #3 - it explains what you need to copy in nouveau to try to load it.
[Bug 209345] [nouveau] unknown chipset (0f22d0a1) (nVidia Tesla K80)
https://bugzilla.kernel.org/show_bug.cgi?id=209345 --- Comment #14 from Ilia Mirkin (imir...@alum.mit.edu) --- Also, wow, BAR1 = 16GB?? Normally it's like 256MB. No wonder your TB setup had issues.
[Bug 209345] [nouveau] unknown chipset (0f22d0a1) (nVidia Tesla K80)
https://bugzilla.kernel.org/show_bug.cgi?id=209345 --- Comment #15 from Alexander von Gluck (kallis...@unixzen.com) --- Applied my patch above to ArchLinux (5.11.13-arch1-1) and gave it a whirl. Got a little information from nouveau before the system hard-locks. nouveau :0d:00.0: enabling device ( -> 0002) nouveau :0d:00.0: NVIDIA GK120 (0f22d0a1) nouveau :0d:00.0: bios: version 80.21.1f.00.01 nouveau :0d:00.0: fb: 11520 MiB GDDR5 (hard crash) I might get more information from serial... however, I ran into an unrelated issue: cooling! The Tesla K80 got up to 175F+ at idle and I had to shut things down. I need to rig up a better cooling solution.
[Bug 212469] plymouth animation freezes during shutdown
https://bugzilla.kernel.org/show_bug.cgi?id=212469 Arnas (arnasz...@gmail.com) changed: What|Removed |Added CC||arnasz...@gmail.com --- Comment #3 from Arnas (arnasz...@gmail.com) --- I have the same issue on my system. When I hit shutdown, the animation plays, then the screen goes black, then it turns back on and shows a frozen animation, and then the machine finally shuts down. Here is my system - Kernel - 5.11.14.arch1-1 CPU - Intel Core i5-1035G1 GPU - Intel UHD G1 Graphics, modesetting driver DE - KDE Plasma 5.21.4 Plymouth - plymouth AUR package (non-git) I can also confirm that the issue is only present from 5.11.x. The issue does not exist with kernel 5.10.16.
[Bug 212369] AMDGPU: GPU hangs with '*ERROR* Couldn't update BO_VA (-12)' on MIPS64
https://bugzilla.kernel.org/show_bug.cgi?id=212369 Xi Ruoyao (xry...@mengyan1223.wang) changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |CODE_FIX --- Comment #3 from Xi Ruoyao (xry...@mengyan1223.wang) --- Fixed at 566c6e25f957ebdb0b6e8073ee291049118f47fb.
[Bug 212729] New: amdgpu: WARN_ON drivers/gpu/drm/amd/amdgpu/../display/dc/dc_helper.c:119 set_reg_field_values.constprop.0+0xbe/0xe0
https://bugzilla.kernel.org/show_bug.cgi?id=212729 Bug ID: 212729 Summary: amdgpu: WARN_ON drivers/gpu/drm/amd/amdgpu/../display/dc/dc_helper.c:119 set_reg_field_values.constprop.0+0xbe/0xe0 Product: Drivers Version: 2.5 Kernel Version: 5.11.14-200.fc33.x86_64 Hardware: All OS: Linux Tree: Mainline Status: NEW Severity: normal Priority: P1 Component: Video(DRI - non Intel) Assignee: drivers_video-...@kernel-bugs.osdl.org Reporter: b...@kernel.crashing.org Regression: No Created attachment 296437 --> https://bugzilla.kernel.org/attachment.cgi?id=296437&action=edit Full kernel log On Fedora 33, I recently started getting these (I haven't had a chance to bisect). Note the error that happens shortly before the WARN_ON, as it might be relevant. This seems to go along with the screen occasionally not coming back from blanking when the machine is left idle. Sometimes unplugging/replugging the DP connector works, sometimes it doesn't. I *think* it's related but I'm not 100% certain. The GPU is a 6800XT. [4.361992] [drm] Initialized amdgpu 3.40.0 20150101 for :0d:00.0 on minor 0 [4.668196] [drm] REG_WAIT timeout 1us * 10 tries - mpc2_assert_idle_mpcc line:480 [4.800700] EXT4-fs (dm-0): mounted filesystem with ordered data mode. Opts: (null). Quota mode: none.
[4.844273] [ cut here ] [4.844274] WARNING: CPU: 7 PID: 504 at drivers/gpu/drm/amd/amdgpu/../display/dc/dc_helper.c:119 set_reg_field_values.constprop.0+0xbe/0xe0 [amdgpu] [4.844400] Modules linked in: amdgpu drm_ttm_helper ttm iommu_v2 gpu_sched drm_kms_helper cec crct10dif_pclmul crc32_pclmul crc32c_intel drm ghash_clmulni_intel igb ccp nvme r8169 nvme_core dca i2c_algo_bit wmi pinctrl_amd fuse [4.844410] CPU: 7 PID: 504 Comm: plymouthd Not tainted 5.11.14-200.fc33.x86_64 #1 [4.844412] Hardware name: System manufacturer System Product Name/ROG STRIX X570-E GAMING, BIOS 3603 03/19/2021 [4.844413] RIP: 0010:set_reg_field_values.constprop.0+0xbe/0xe0 [amdgpu] [4.844525] Code: 50 08 49 89 51 08 8b 08 48 8d 42 08 49 89 41 08 44 8b 02 48 8d 50 08 0f b6 c9 49 89 51 08 8b 00 45 85 c0 75 b5 0f 0b eb b1 c3 <0f> 0b e9 4d ff ff ff 49 8b 51 08 eb d1 49 8b 41 08 eb d6 66 66 2e [4.844526] RSP: 0018:c0af00d377a0 EFLAGS: 00010246 [4.844528] RAX: RBX: a078d4de RCX: [4.844528] RDX: RSI: 0001 RDI: c0af00d377a8 [4.844529] RBP: c0af00d37820 R08: 0001 R09: c0af00d377b0 [4.844530] R10: R11: 39e8 R12: 0005 [4.844531] R13: a078d80e1740 R14: 3ae0 R15: a078d9bc01e8 [4.844532] FS: 7fbf4f575f40() GS:a07fcebc() knlGS: [4.844533] CS: 0010 DS: ES: CR0: 80050033 [4.844534] CR2: 7fd50a00d000 CR3: 000110d6c000 CR4: 00350ee0 [4.844535] Call Trace: [4.844537] generic_reg_update_ex+0x5a/0x1c0 [amdgpu] [4.844646] ? dcn20_enable_plane+0x77/0x1e0 [amdgpu] [4.844769] dcn20_program_front_end_for_ctx+0x997/0xb20 [amdgpu] [4.844888] ? optc3_lock+0x9d/0xb0 [amdgpu] [4.845004] dc_commit_state+0x49a/0xa30 [amdgpu] [4.845117] ? drm_calc_timestamping_constants+0x195/0x1f0 [drm] [4.845135] amdgpu_dm_atomic_commit_tail+0x585/0x2600 [amdgpu] [4.845253] ? amdgpu_vm_bo_invalidate+0x83/0x1a0 [amdgpu] [4.845345] ? amdgpu_bo_move_notify+0x41/0xe0 [amdgpu] [4.845434] ? amdgpu_bo_move+0x2d1/0x6d0 [amdgpu] [4.845522] ? ttm_bo_handle_move_mem+0x90/0x180 [ttm] [4.845526] ? ttm_bo_validate+0x11b/0x150 [ttm] [4.845529] ? 
dm_plane_helper_prepare_fb+0x18c/0x220 [amdgpu] [4.845644] ? _cond_resched+0x16/0x40 [4.845647] ? _cond_resched+0x16/0x40 [4.845648] ? __wait_for_common+0x2b/0x140 [4.845650] commit_tail+0x94/0x130 [drm_kms_helper] [4.845661] drm_atomic_helper_commit+0x113/0x140 [drm_kms_helper] [4.845669] drm_atomic_helper_set_config+0x70/0xb0 [drm_kms_helper] [4.845678] drm_mode_setcrtc+0x1d3/0x6f0 [drm] [4.845695] ? avc_has_extended_perms+0x18d/0x3e0 [4.845698] ? drm_mode_getcrtc+0x180/0x180 [drm] [4.845712] drm_ioctl_kernel+0x86/0xd0 [drm] [4.845730] drm_ioctl+0x20f/0x3a0 [drm] [4.845745] ? drm_mode_getcrtc+0x180/0x180 [drm] [4.845760] amdgpu_drm_ioctl+0x49/0x80 [amdgpu] [4.845848] __x64_sys_ioctl+0x83/0xb0 [4.845850] do_syscall_64+0x33/0x40 [4.845853] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [4.845855] RIP: 0033:0x7fbf4f8315db [4.845856] Code: 89 d8 49 8d 3c 1c 48 f7 d8 49 39 c4 72 b5 e8 1c ff ff ff 85 c0 78 ba 4c 89 e0 5b 5d 41 5c c3 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 6d b8 0c 00 f7 d8 64 89 0
[Bug 212739] New: [amdgpu] Sporadic GPU errors, screen artifacts and GPU-induced system lockups on Vega 10 (Raven Ridge)
https://bugzilla.kernel.org/show_bug.cgi?id=212739 Bug ID: 212739 Summary: [amdgpu] Sporadic GPU errors, screen artifacts and GPU-induced system lockups on Vega 10 (Raven Ridge) Product: Drivers Version: 2.5 Kernel Version: 5.11.14-1, 5.12.rc7.d0411.gd434405-1 Hardware: x86-64 OS: Linux Tree: Mainline Status: NEW Severity: normal Priority: P1 Component: Video(DRI - non Intel) Assignee: drivers_video-...@kernel-bugs.osdl.org Reporter: tu...@cryptolab.net Regression: No Created attachment 296449 --> https://bugzilla.kernel.org/attachment.cgi?id=296449&action=edit Example of GPU artifacts from the recoverable variant of this error >From time to time, the amdgpu driver will report a page fault (sometimes coming from pid 0, sometimes coming from the web browser, sometimes the screen compositor or Xorg, sometimes a video player, etc.) as shown below: >kernel: amdgpu :05:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 >ring:0 vmid:4 pasid:0, for process pid 0 thread pid 0) >kernel: amdgpu :05:00.0: amdgpu: in page starting at address >0x800101606000 from client 27 >kernel: amdgpu :05:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00401031 >kernel: amdgpu :05:00.0: amdgpu: Faulty UTCL2 client ID: TCP >(0x8) >kernel: amdgpu :05:00.0: amdgpu: MORE_FAULTS: 0x1 >kernel: amdgpu :05:00.0: amdgpu: WALKER_ERROR: 0x0 >kernel: amdgpu :05:00.0: amdgpu: PERMISSION_FAULTS: 0x3 >kernel: amdgpu :05:00.0: amdgpu: MAPPING_ERROR: 0x0 >kernel: amdgpu :05:00.0: amdgpu: RW: 0x0` This message is repeated several thousand times in dmesg ("x callbacks suppressed") with different addresses of form 0x80010160Y000 (where Y is a hex digit between 1-8.) In the meantime, the computer is completely hung in terms of display, i.e. inputs go through, music keeps playing, but the screen is static. Then, several seconds later, it's followed by: >kernel: [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for fences >timed out! 
And finally, >[drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, but soft >recovered After this, the computer resumes operation (but with GPU artifacts having appeared on the screen - for an example of these, see attached screenshot). Alternatively, sometimes instead of the soft recovery message, the GPU cannot recover and displays the following messages in the kernel log: >kernel: [drm:gfx_v9_0_priv_reg_irq [amdgpu]] *ERROR* Illegal register access >in command stream >kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, signaled >seq=3356413, emitted seq=3356415 >kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: >process Xorg pid 14524 thread Xorg:cs0 pid 14539 >kernel: amdgpu :05:00.0: amdgpu: GPU reset begin! >kernel: [drm] free PSP TMR buffer >kernel: amdgpu :05:00.0: amdgpu: MODE2 reset >kernel: amdgpu :05:00.0: amdgpu: GPU reset succeeded, trying to resume >kernel: [drm] PCIE GART of 1024M enabled (table at 0x00F40090). >kernel: [drm] PSP is resuming... >kernel: [drm] reserve 0x40 from 0xf47fc0 for PSP TMR >kernel: amdgpu :05:00.0: amdgpu: RAS: optional ras ta ucode is not >available >kernel: amdgpu :05:00.0: amdgpu: RAP: optional rap ta ucode is not >available >kernel: [drm] kiq ring mec 2 pipe 1 q 0 >kernel: amdgpu :05:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* >ring sdma0 test failed (-110) >kernel: [drm:amdgpu_device_ip_resume_phase2 [amdgpu]] *ERROR* resume of IP >block failed -110 >kernel: amdgpu :05:00.0: amdgpu: GPU reset(4) failed >kernel: amdgpu :05:00.0: amdgpu: GPU reset end with ret = -110 at which point rebooting is necessary as the GPU will not resume operation. This also happens on the latest 5.12 rc (as of the writing of this bug report, this is rc7). -- You may reply to this email to add a comment. You are receiving this mail because: You are watching the assignee of the bug. 
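The report above notes that the page-fault message repeats thousands of times with addresses of the form 0x80010160Y000. When triaging such a log, it can help to tally the faulting pages. The following is a minimal sketch, assuming the log text is available as a string; `count_fault_pages` is a hypothetical helper, and the second address in the sample is made up to match the pattern the reporter describes (only 0x800101606000 appears in the report itself).

```python
import re
from collections import Counter

# Matches the address in lines such as
# "... in page starting at address 0x800101606000 from client 27".
# The optional ">" tolerates mail quote markers from line wrapping.
_FAULT_ADDR = re.compile(
    r"in page starting at address\s+>?\s*(0x[0-9a-fA-F]+)"
)


def count_fault_pages(log_text):
    """Count how often each faulting page address appears in log_text."""
    return Counter(m.group(1).lower() for m in _FAULT_ADDR.finditer(log_text))


# Illustrative sample; the 0x800101603000 entry is invented for the example.
SAMPLE = (
    "kernel: amdgpu: in page starting at address 0x800101606000 from client 27\n"
    "kernel: amdgpu: in page starting at address 0x800101603000 from client 27\n"
    "kernel: amdgpu: in page starting at address 0x800101606000 from client 27\n"
)

for address, hits in count_fault_pages(SAMPLE).most_common():
    print(address, hits)
```

Because the kernel rate-limits these messages ("callbacks suppressed"), the counts are a lower bound on the real number of faults.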
[Bug 211807] [drm:drm_dp_mst_dpcd_read] *ERROR* mstb 000000004e6288dd port 3: DPCD read on addr 0x60 for 1 bytes NAKed
https://bugzilla.kernel.org/show_bug.cgi?id=211807 n...@disroot.org changed: What|Removed |Added CC||n...@disroot.org --- Comment #1 from n...@disroot.org --- I have just discovered, after months of searching, that I receive this same error. I have a desktop computer that uses both an AMD WX3100 and an AMD RX570. When I turn off the monitor connected to either of my graphics cards (effectively a hotplug) and then turn it back on, I receive the error [drm:drm_dp_mst_dpcd_read [drm_kms_helper]] *ERROR* mstb 8ff61da7 port 8: DPCD read on addr 0x60 for 1 bytes NAKed Xorg crashes, I receive the message [drm:drm_dp_check_act_status [drm_kms_helper]] *ERROR* Failed to get ACT after 3000ms, last status: 01 and it takes up to 40 seconds for display of the VT to resume. The longer the monitor remains off, the higher the chance that display of the VT will never resume after the monitor is turned on, requiring a hard restart. This error has persisted for about 3 months, ever since I first installed Linux on this computer, on every kernel variation and build I have tried to date. Currently reproducible on 5.11.16.
[Bug 212469] plymouth animation freezes during shutdown
https://bugzilla.kernel.org/show_bug.cgi?id=212469 DieKleene (mail2021a...@detlef-pogrzeba.de) changed: What|Removed |Added CC||mail2021ac62@detlef-pogrzeba.de --- Comment #4 from DieKleene (mail2021a...@detlef-pogrzeba.de) --- Similar problem on shutdown / restart: the computer hangs on shutdown or restart and shows one of the following symptoms: A watchdog error message appears and it then takes up to 15 minutes to shut down / restart. Or: the computer shuts down with black screens (both monitors are switched off), but then alternately activates one of the two monitors (one remains switched off and the other is switched on). In addition: occasionally only one of the two monitors is activated when the system is started. Or: both monitors are activated, but the position settings are no longer correct. Or: mirror mode has been activated. My system: Arch Linux XFCE AMD Ryzen 7 2700 Eight-Core Processor AMD Radeon (TM) R9 380 Series (TONGA, DRM 3.40.0, 5.11.16-arch1-1, LLVM 11.1.0) The X.Org Foundation 1.20.11 Current Display Name: :0.0 Vendor : The X.Org Foundation Version : 1.20.11 Release Number : 12011000 -Screens- Screen 0: 1920x2160 pixels -Outputs (XRandR)- DP-1: Disconnected; Unused HDMI-1 : Disconnected; Unused DVI-D-1 : Connected; 1920x1080 pixels, offset (0, 0) DVI-D-2 : Connected; 1920x1080 pixels, offset (0, 1080)
[Bug 212871] New: AMD Radeon Pro VEGA 20 (Aka Vega12) - Glitch and freeze on any kernel and/or distro.
https://bugzilla.kernel.org/show_bug.cgi?id=212871 Bug ID: 212871 Summary: AMD Radeon Pro VEGA 20 (Aka Vega12) - Glitch and freeze on any kernel and/or distro. Product: Drivers Version: 2.5 Kernel Version: Any Hardware: x86-64 OS: Linux Tree: Mainline Status: NEW Severity: blocking Priority: P1 Component: Video(DRI - non Intel) Assignee: drivers_video-...@kernel-bugs.osdl.org Reporter: rodrigo.lug...@icloud.com Regression: No I have a MacBook Pro with a Vega 20, which uses the amdgpu vega12 firmware. When I boot any distro, the graphics glitch and the computer freezes. If I install amdgpu pro on Ubuntu, it works flawlessly. Would you help me debug this and get it fixed upstream? Let me know what I can send to complement the information required for analysis, such as logs or dmesg. I would be very happy to help and participate in this. Please excuse me if this is not the right place to ask this kind of thing, and if you can, kindly redirect me to the right place.
[Bug 212871] AMD Radeon Pro VEGA 20 (Aka Vega12) - Glitch and freeze on any kernel and/or distro.
https://bugzilla.kernel.org/show_bug.cgi?id=212871 Alex Deucher (alexdeuc...@gmail.com) changed: What|Removed |Added CC||alexdeuc...@gmail.com --- Comment #1 from Alex Deucher (alexdeuc...@gmail.com) --- amdgpu pro uses the same driver as upstream, just packaged so that you can install it on enterprise distros, so the code is the same. What driver package version did you use? What upstream kernels have you tried? Please include the dmesg output from the working and non-working cases.
[Bug 212871] AMD Radeon Pro VEGA 20 (Aka Vega12) - Glitch and freeze on any kernel and/or distro.
https://bugzilla.kernel.org/show_bug.cgi?id=212871 --- Comment #3 from Alex Deucher (alexdeuc...@gmail.com) --- Note that you don't need to file two bugs.
[Bug 212871] AMD Radeon Pro VEGA 20 (Aka Vega12) - Glitch and freeze on any kernel and/or distro.
https://bugzilla.kernel.org/show_bug.cgi?id=212871 --- Comment #2 from Alex Deucher (alexdeuc...@gmail.com) --- Also filed as: https://gitlab.freedesktop.org/drm/amd/-/issues/1582
[Bug 212881] New: nouveau: BUG: kernel NULL pointer dereference in nouveau_bo_sync_for_device
https://bugzilla.kernel.org/show_bug.cgi?id=212881 Bug ID: 212881 Summary: nouveau: BUG: kernel NULL pointer dereference in nouveau_bo_sync_for_device Product: Drivers Version: 2.5 Kernel Version: 5.11.0 Hardware: Intel OS: Linux Tree: Mainline Status: NEW Severity: normal Priority: P1 Component: Video(DRI - non Intel) Assignee: drivers_video-...@kernel-bugs.osdl.org Reporter: dave.muel...@gmx.ch Regression: No Since kernel 5.11, the nouveau driver crash with a NULL pointer exception. Below is the (stripped) dmesg output from kernel 5.12, which still shows the same problem. The same (old) hardware works fine with kernel 5.10. [0.00] Linux version 5.12.0 (root@test) (gcc (GCC) 5.5.0, GNU ld version 2.27-slack15) #2 PREEMPT Wed Apr 28 15:34:45 CEST 2021 [ 11.415023] nouveau :01:00.0: vgaarb: deactivate vga console [ 11.416766] Console: switching to colour dummy device 80x25 [ 11.417616] nouveau :01:00.0: NVIDIA NV17 (017200a5) [ 11.517414] nouveau :01:00.0: bios: version 04.17.00.45.00 [ 11.518545] agpgart-intel :00:00.0: AGP 2.0 bridge [ 11.518692] agpgart-intel :00:00.0: putting AGP V2 device into 4x mode [ 11.518748] nouveau :01:00.0: putting AGP V2 device into 4x mode [ 11.518918] agpgart-intel :00:00.0: AGP 2.0 bridge [ 11.518941] agpgart-intel :00:00.0: putting AGP V2 device into 4x mode [ 11.518977] nouveau :01:00.0: putting AGP V2 device into 4x mode [ 11.519038] nouveau :01:00.0: timer: unknown input clock freq [ 11.521866] nouveau :01:00.0: fb: 64 MiB DDR1 [ 11.522352] sr 2:0:0:0: Attached scsi CD-ROM sr1 [ 11.534498] [TTM] Zone kernel: Available graphics memory: 569334 KiB [ 11.534524] [TTM] Zone highmem: Available graphics memory: 1033956 KiB [ 11.534581] nouveau :01:00.0: DRM: VRAM: 63 MiB [ 11.534593] nouveau :01:00.0: DRM: GART: 128 MiB [ 11.534604] nouveau :01:00.0: DRM: BMP version 5.21 [ 11.534614] nouveau :01:00.0: DRM: DCB version 2.0 [ 11.534625] nouveau :01:00.0: DRM: DCB outp 00: 01000100 88b8 [ 11.534636] nouveau :01:00.0: DRM: DCB outp 01: 
02010111 0003 [ 11.534646] nouveau :01:00.0: DRM: DCB outp 02: 02010211 0003 [ 11.534657] nouveau :01:00.0: DRM: Merging DCB entries 1 and 2 [ 11.535551] nouveau :01:00.0: DRM: Loading NV17 power sequencing microcode [ 11.536036] BUG: kernel NULL pointer dereference, address: [ 11.536053] #PF: supervisor read access in kernel mode [ 11.536062] #PF: error_code(0x) - not-present page [ 11.536070] *pde = [ 11.536079] Oops: [#1] PREEMPT [ 11.536089] CPU: 0 PID: 388 Comm: udevd Not tainted 5.12.0 #2 [ 11.536099] Hardware name: Dell Computer Corporation Dimension 4550 /, BIOS A08 09/23/2003 [ 11.536110] EIP: nouveau_bo_sync_for_device+0x91/0x103 [nouveau] [ 11.536393] Code: 85 8b 00 00 00 bb 01 00 00 00 eb 08 83 c2 24 3b 14 87 75 0a 83 c3 01 83 c0 01 39 c1 77 ee 89 d9 c1 e1 0c 8b 45 10 8b 7c 24 10 <8b> 14 38 8b 44 24 14 8b 40 e0 8b 40 08 c7 04 24 01 00 00 00 e8 7b [ 11.536411] EAX: EBX: 0010 ECX: 0001 EDX: f624e040 [ 11.536421] ESI: EDI: EBP: b20f5cc0 ESP: b1c83b44 [ 11.536430] DS: 007b ES: 007b FS: GS: 00e0 SS: 0068 EFLAGS: 00010206 [ 11.536439] CR0: 80050033 CR2: CR3: 01c63000 CR4: 06d0 [ 11.536449] Call Trace: [ 11.536459] ? nouveau_bo_validate+0x5d/0x77 [nouveau] [ 11.536780] ? nouveau_bo_pin+0x10e/0x287 [nouveau] [ 11.537099] ? nouveau_bo_new+0x64/0x74 [nouveau] [ 11.537418] ? nouveau_channel_prep+0x113/0x32e [nouveau] [ 11.537744] ? nouveau_channel_new+0x59/0x75d [nouveau] [ 11.538067] ? ttm_bo_kmap+0x1a7/0x1f0 [ttm] [ 11.538084] ? ttm_bo_kmap+0x1a7/0x1f0 [ttm] [ 11.538098] ? nouveau_bo_pin+0x150/0x287 [nouveau] [ 11.538417] ? nouveau_bo_map+0x75/0x8b [nouveau] [ 11.538736] ? nvif_object_sclass_put+0xa/0x12 [nouveau] [ 11.538976] ? kfree+0x66/0xca [ 11.538992] ? nouveau_drm_device_init+0x42f/0x7d7 [nouveau] [ 11.539311] ? pci_read_config_word+0x27/0x2c [ 11.539325] ? pci_enable_device_flags+0xd0/0xe5 [ 11.539335] ? nouveau_drm_probe+0x110/0x1e5 [nouveau] [ 11.539654] ? nouveau_drm_device_init+0x7d7/0x7d7 [nouveau] [ 11.539973] ? 
pci_device_probe+0x82/0xf0 [ 11.539984] ? sysfs_create_link+0x1d/0x2e [ 11.539996] ? really_probe+0x19a/0x341 [ 11.540007] ? driver_probe_device+0x3d/0x85 [ 11.540016] ? device_driver_attach+0x3c/0x40 [ 11.540024] ? __driver_attach+0x6f/0x95 [ 11.540033] ? device_driver_attach+0x40/0x40 [ 11.540041] ? bus_for_each_dev+0x48/0x88 [ 11.540049] ? driver_attach+0x16/0x1a [ 11.540056] ? device_driver_attach+0x40/0x40 [ 11.540065] ? bus_add_driver+0x111/0x1b4 [ 11.540073] ? driver_register+0x51/0xe5 [
[Bug 212881] nouveau: BUG: kernel NULL pointer dereference in nouveau_bo_sync_for_device
https://bugzilla.kernel.org/show_bug.cgi?id=212881 dave.muel...@gmx.ch changed: What|Removed |Added Regression|No |Yes
[Bug 212107] Temperature increase by 15°C on radeon gpu
https://bugzilla.kernel.org/show_bug.cgi?id=212107 Martin (martin...@gmx.com) changed: What|Removed |Added Status|CLOSED |REOPENED Resolution|ANSWERED|--- --- Comment #9 from Martin (martin...@gmx.com) --- Hello, is it possible to return to the behaviour from version 5.10? Back then my GPU was cool and quiet. I'm currently running 5.11.17, and the GPU temperature gets to 70°C while the fan runs at only about 300 rpm. That is without touching anything in /sys/class/drm/card0/device/hwmon/hwmon1. When I disable fan control by writing 0 to /sys/class/drm/card0/device/hwmon/hwmon1/pwm1_enable, the fan spins up to top speed. Writing 2 keeps it running at 2000 rpm, which is loud. That is strange, because the value is already 2 after booting. Ideally, I would like to return to how it worked on 5.10.
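For readers following the fan-control discussion in comment #9: under the generic hwmon sysfs convention, pwm1_enable is 0 for no fan speed control (fan at full speed), 1 for manual control, and 2 or higher for automatic control. The Python sketch below assumes that convention holds for the driver in question and only reads the files; the helper names describe_pwm_enable and read_fan_state are illustrative and not part of any kernel tool.

```python
from pathlib import Path

# Meanings per the generic hwmon sysfs convention (assumed to apply here).
PWM_ENABLE_MODES = {
    0: "no fan speed control (fan at full speed)",
    1: "manual fan speed control",
}


def describe_pwm_enable(value):
    """Translate a pwm1_enable value into a human-readable mode."""
    if value in PWM_ENABLE_MODES:
        return PWM_ENABLE_MODES[value]
    if value >= 2:
        return "automatic fan speed control"
    return "unknown"


def read_fan_state(hwmon_dir):
    """Read a few hwmon attributes; missing or unreadable files yield None."""
    state = {}
    for name in ("pwm1_enable", "pwm1", "fan1_input", "temp1_input"):
        try:
            state[name] = int((Path(hwmon_dir) / name).read_text().strip())
        except (OSError, ValueError):
            state[name] = None
    return state


if __name__ == "__main__":
    # Path taken from the report; the hwmon index can differ between boots.
    state = read_fan_state("/sys/class/drm/card0/device/hwmon/hwmon1")
    print(state)
    if state["pwm1_enable"] is not None:
        print(describe_pwm_enable(state["pwm1_enable"]))
```

Attaching this kind of snapshot from both a 5.10 and a 5.11 boot would make the two fan behaviours directly comparable.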
[Bug 212107] Temperature increase by 15°C on radeon gpu
https://bugzilla.kernel.org/show_bug.cgi?id=212107 --- Comment #10 from Martin (martin...@gmx.com) --- (In reply to Martin from comment #9) > > I'm running 5.11.17 currently and temperature on the GPU gets to 70°C but > fan is at like 300rpm. > This isn't always reproducible. I thought it might be related to suspending my PC, but in the last few days the temperature has stayed around 55°C.
[Bug 211277] sometimes crash at s2ram-wake (Ryzen 3500U): amdgpu, drm, commit_tail, amdgpu_dm_atomic_commit_tail
https://bugzilla.kernel.org/show_bug.cgi?id=211277 --- Comment #35 from kolAflash (kolafl...@kolahilft.de) --- Created attachment 298193 --> https://bugzilla.kernel.org/attachment.cgi?id=298193&action=edit /var/log/kern.log running amd-drm-next-5.14-2021-05-12 (ae30d41eb) with Xorg Sorry for the long delay. I've tested: 1. Current Debian-11 testing Linux-5.10.0-8 with amdgpu.ip_block_mask=0x0ff while running Xorg. Result: everything ok 2. amd-drm-next-5.14-2021-05-12* (ae30d41eb) without any special kernel options while running Xorg. Result: - crashes - also the screen starts flickering about every 10 seconds after second resume - flickering also happens with using a8f768874^ (before the first fix-commit by Alex D.) - log attached: 5.12.0-rc7-original-ae30d41eb_crash.txt 3. Upstream Linux-5.14.0-rc4. Result: Still broken. * amd-drm-next-5.14-2021-05-12 https://gitlab.freedesktop.org/agd5f/linux/-/tree/amd-drm-next-5.14-2021-05-12 ae30d41eb
[Bug 208981] trace with B550I AORUS PRO AX and AMD Ryzen 5 PRO 4650G
https://bugzilla.kernel.org/show_bug.cgi?id=208981 --- Comment #10 from Florian La Roche (florian.laro...@gmail.com) --- This seems to be fixed after updating to BIOS F12 from 2021-01-18, BIOS Revision: 5.17. There are even newer BIOS revisions available, but they only work with RAM at 2133 MT/s instead of the usual 3200 MT/s and seem to be unstable. best regards, Florian La Roche
[Bug 211277] sometimes crash at s2ram-wake (Ryzen 3500U): amdgpu, drm, commit_tail, amdgpu_dm_atomic_commit_tail
https://bugzilla.kernel.org/show_bug.cgi?id=211277 --- Comment #36 from Jerome C (m...@jeromec.com) --- I've been watching linux-next and noticed that this commit https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/drivers/gpu/drm/amd?id=65660ad349fd947feb16b45ff9231f2ceaf44318 was posted to linux-next sometime between 5.10 and 5.11 (I don't remember exactly), but it keeps getting pushed back and has not been mainlined. I think this is why the issues are still here, and nobody from AMD has responded since comment 29.
[Bug 213201] [KAVERI] memory leak - unreferenced object 0xffff8881700cf988 (size 56)
https://bugzilla.kernel.org/show_bug.cgi?id=213201 Erhard F. (erhar...@mailbox.org) changed: What|Removed |Added Attachment #296969|0 |1 is obsolete|| Attachment #297925|0 |1 is obsolete|| --- Comment #11 from Erhard F. (erhar...@mailbox.org) --- Created attachment 298215 --> https://bugzilla.kernel.org/attachment.cgi?id=298215&action=edit kernel .config (5.14-rc4, AMD A10-7860K)
[Bug 213201] [KAVERI] memory leak - unreferenced object 0xffff8881700cf988 (size 56)
https://bugzilla.kernel.org/show_bug.cgi?id=213201 --- Comment #12 from Erhard F. (erhar...@mailbox.org) --- Created attachment 298217 --> https://bugzilla.kernel.org/attachment.cgi?id=298217&action=edit kernel dmesg (5.14-rc4, AMD A10-7860K)
[Bug 213201] [KAVERI] memory leak - unreferenced object 0xffff8881700cf988 (size 56)
https://bugzilla.kernel.org/show_bug.cgi?id=213201 Erhard F. (erhar...@mailbox.org) changed: What|Removed |Added Attachment #297921|0 |1 is obsolete|| --- Comment #13 from Erhard F. (erhar...@mailbox.org) --- Created attachment 298219 --> https://bugzilla.kernel.org/attachment.cgi?id=298219&action=edit output of /sys/kernel/debug/kmemleak (kernel 5.14-rc4) Same board, another CPU: unreferenced object 0x888102e48bd0 (size 56): comm "systemd-udevd", pid 199, jiffies 4294881489 (age 3502.134s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 0d 01 70 00 00 00 00 00 ..p. 7b 5d 02 00 00 c9 ff ff 00 00 00 00 00 00 00 00 {].. backtrace: [] kmem_cache_alloc+0x109/0x132 [] acpi_ps_alloc_op+0x8b/0x1c4 [] acpi_ps_create_op+0x4b1/0x8ec [] acpi_ps_parse_loop+0x401/0x1062 [] acpi_ps_parse_aml+0x1cd/0x6fa [] acpi_ps_execute_method+0x51f/0x57b [] acpi_ns_evaluate+0x64c/0x886 [] acpi_evaluate_object+0x335/0x690 [] amdgpu_atcs_call.constprop.0+0x141/0x1bd [amdgpu] [] amdgpu_atcs_pci_probe_handle.isra.0+0x147/0x2a1 [amdgpu] [] amdgpu_acpi_detect+0xd1/0x38e [amdgpu] [] 0xc1c0c0aa [] do_one_initcall+0xe0/0x1fc [] do_init_module+0x1c1/0x584 [] load_module+0x4ea2/0x5cc6 [] __do_sys_finit_module+0xf6/0x145 unreferenced object 0x888102e48480 (size 56): comm "systemd-udevd", pid 199, jiffies 4294881489 (age 3502.134s) hex dump (first 32 bytes): d0 8b e4 02 81 88 ff ff 0d 01 2d 00 00 00 00 00 ..-. 7c 5d 02 00 00 c9 ff ff 00 00 00 00 00 00 00 00 |].. 
backtrace: [] kmem_cache_alloc+0x109/0x132 [] acpi_ps_alloc_op+0x8b/0x1c4 [] acpi_ps_create_op+0x4b1/0x8ec [] acpi_ps_parse_loop+0x401/0x1062 [] acpi_ps_parse_aml+0x1cd/0x6fa [] acpi_ps_execute_method+0x51f/0x57b [] acpi_ns_evaluate+0x64c/0x886 [] acpi_evaluate_object+0x335/0x690 [] amdgpu_atcs_call.constprop.0+0x141/0x1bd [amdgpu] [] amdgpu_atcs_pci_probe_handle.isra.0+0x147/0x2a1 [amdgpu] [] amdgpu_acpi_detect+0xd1/0x38e [amdgpu] [] 0xc1c0c0aa [] do_one_initcall+0xe0/0x1fc [] do_init_module+0x1c1/0x584 [] load_module+0x4ea2/0x5cc6 [] __do_sys_finit_module+0xf6/0x145
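The kmemleak report above repeats the same backtrace for several objects, which is easier to see once the records are grouped. The following Python sketch is one way to do that, assuming the usual record layout ("unreferenced object ... (size N): ... backtrace: ... func+0xoff/0xlen"); the summarize helper and its depth parameter are illustrative only, and the regular expressions are a best-effort guess that also tolerates the stripped addresses seen in this archive.

```python
import re

# "func+0x109/0x132" style frames; raw addresses such as "0xc1c0c0aa" are skipped.
FRAME_RE = re.compile(r"([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+")
SIZE_RE = re.compile(r"\(size (\d+)\)")


def summarize(text, depth=3):
    """Group kmemleak records by their first `depth` backtrace frames.

    Returns a dict mapping a tuple of function names to (count, total_bytes).
    """
    totals = {}
    for record in text.split("unreferenced object")[1:]:
        size_match = SIZE_RE.search(record)
        size = int(size_match.group(1)) if size_match else 0
        # Only look at text after "backtrace:" so the hex dump is ignored.
        _, _, trace = record.partition("backtrace:")
        key = tuple(m.group(1) for m in FRAME_RE.finditer(trace))[:depth]
        count, total = totals.get(key, (0, 0))
        totals[key] = (count + 1, total + size)
    return totals


if __name__ == "__main__":
    import sys

    # Usage: python3 this_script.py < saved_kmemleak_output.txt
    groups = summarize(sys.stdin.read())
    for frames, (count, total) in sorted(groups.items(), key=lambda kv: -kv[1][1]):
        print(f"{count:5d} objects {total:8d} bytes  {' <- '.join(frames)}")
```

Grouping by only the first few frames keeps allocations that share a call path together; a larger depth separates them by caller.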
[Bug 213983] New: WARNING: CPU: 3 PID: 520 at drivers/gpu/drm/ttm/ttm_bo.c:409 ttm_bo_release+0x7a/0x803 [ttm]
https://bugzilla.kernel.org/show_bug.cgi?id=213983 Bug ID: 213983 Summary: WARNING: CPU: 3 PID: 520 at drivers/gpu/drm/ttm/ttm_bo.c:409 ttm_bo_release+0x7a/0x803 [ttm] Product: Drivers Version: 2.5 Kernel Version: 5.14-rc4 Hardware: All OS: Linux Tree: Mainline Status: NEW Severity: normal Priority: P1 Component: Video(DRI - non Intel) Assignee: drivers_video-...@kernel-bugs.osdl.org Reporter: erhar...@mailbox.org Regression: No Created attachment 298221 --> https://bugzilla.kernel.org/attachment.cgi?id=298221&action=edit kernel dmesg (5.14-rc4, AMD A10-7860K) System was running fine for hours but got this at reboot: [...] [35939.202358] [ cut here ] [35939.202603] WARNING: CPU: 3 PID: 520 at drivers/gpu/drm/ttm/ttm_bo.c:409 ttm_bo_release+0x7a/0x803 [ttm] [35939.202870] Modules linked in: rfkill dm_crypt nhpoly1305_sse2 nhpoly1305 chacha_generic chacha_x86_64 libchacha adiantum libpoly1305 algif_skcipher input_leds joydev led_class dm_mod hid_generic usbhid hid raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx amdgpu md_mod evdev f2fs crc32_generic lz4hc_compress lz4_compress lz4_decompress edac_mce_amd crc32_pclmul ext4 crc16 mbcache jbd2 ohci_pci snd_hda_codec_hdmi drm_ttm_helper ttm aesni_intel snd_hda_intel mfd_core libaes crypto_simd cryptd gpu_sched snd_intel_dspcfg i2c_algo_bit k10temp snd_hda_codec fam15h_power i2c_piix4 snd_hwdep drm_kms_helper snd_hda_core ohci_hcd cfbfillrect ehci_pci xhci_pci xhci_hcd ehci_hcd syscopyarea cfbimgblt snd_pcm sysfillrect sysimgblt usbcore fb_sys_fops snd_timer cfbcopyarea usb_common snd soundcore video acpi_cpufreq processor button zram nct6775 zsmalloc hwmon_vid hwmon drm nfsd fb fuse font fbdev auth_rpcgss drm_panel_orientation_quirks backlight lockd grace configfs sunrpc [35939.203200] efivarfs [35939.205121] CPU: 3 PID: 520 Comm: X Not tainted 5.14.0-rc4-bdver3 #2 [35939.212327] Hardware name: To Be Filled By O.E.M. 
To Be Filled By O.E.M./A88M-G/3.1, BIOS P1.40C 11/21/2016 [35939.219364] RIP: 0010:ttm_bo_release+0x7a/0x803 [ttm] [35939.219387] Code: e0 2a 48 c1 ea 03 8a 14 02 48 8b 04 24 83 e0 07 83 c0 03 38 d0 7c 0d 84 d2 74 09 48 8b 3c 24 e8 99 eb 7c dd 83 7b 4c 00 74 02 <0f> 0b 48 8d 43 18 48 89 c2 48 89 44 24 10 b8 ff ff 37 00 48 c1 ea [35939.230367] RSP: 0018:c900050dfb08 EFLAGS: 00010202 [35939.230376] RAX: 0007 RBX: 8881716441f0 RCX: 8881716441f0 [35939.230381] RDX: 11102e2c8800 RSI: 0004 RDI: 8881716441f0 [35939.230387] RBP: 8881716441d8 R08: 0001 R09: ed102e2c883f [35939.230391] R10: 8881716441f3 R11: c0b6a201 R12: 88811510 [35939.264555] R13: 888115005608 R14: 888171644058 R15: 9f5cf160 [35939.264564] FS: 7fe7325f9980() GS:8883e058() knlGS: [35939.264572] CS: 0010 DS: ES: CR0: 80050033 [35939.264579] CR2: 7fe73190d0e8 CR3: 000141978000 CR4: 000506e0 [35939.264586] Call Trace: [35939.264591] ? fsnotify_grab_connector+0x8c/0x93 [35939.264608] amdgpu_bo_unref+0x30/0x57 [amdgpu] [35939.318763] amdgpu_gem_object_free+0x69/0x95 [amdgpu] [35939.319132] ? list_add+0xd1/0xd1 [amdgpu] [35939.319478] ? test_bit+0x1d/0x27 [35939.319489] drm_gem_dmabuf_release+0x5b/0x67 [drm] [35939.319551] dma_buf_release+0x113/0x1b6 [35939.319563] __dentry_kill+0x29e/0x302 [35939.319573] dput+0x184/0x1c3 [35939.319582] __fput+0x4dc/0x598 [35939.319590] task_work_run+0xfa/0x118 [35939.319599] do_exit+0x984/0x17ba [35939.319609] ? mm_update_next_owner+0x3fd/0x3fd [35939.319619] do_group_exit+0x229/0x229 [35939.319627] __x64_sys_exit_group+0x39/0x39 [35939.319635] do_syscall_64+0x75/0x88 [35939.319649] ? do_syscall_64+0xe/0x88 [35939.319658] entry_SYSCALL_64_after_hwframe+0x44/0xae [35939.319668] RIP: 0033:0x7fe731edd459 [35939.319676] Code: Unable to access opcode bytes at RIP 0x7fe731edd42f. 
[35939.319680] RSP: 002b:7ffc8d3e1298 EFLAGS: 0246 ORIG_RAX: 00e7 [35939.319691] RAX: ffda RBX: 7fe731fc6920 RCX: 7fe731edd459 [35939.319698] RDX: 003c RSI: 00e7 RDI: [35939.319704] RBP: 7fe731fc6920 R08: fd40 R09: 5630715eb7c0 [35939.319711] R10: 7fe710db0a14 R11: 0246 R12: [35939.319716] R13: R14: 0668 R15: [35939.319724] ---[ end trace 0f92591c8b7a0f11 ]--- # inxi -bZ System:Kernel: 5.14.0-rc4-bdver3 x86_64 bits: 64 Desktop: MATE 1.24.1 Distro: Gentoo Base System release 2.7 Machine: Type: Desktop Mobo: ASRock model: A88M-G/3.1 serial: UEFI: American Megatrends v: P1.40C date: 11/21/2016 CPU: Info: Quad Core AMD A10-7860K Radeon R7 12 Compute Cores 4C+8G
[Bug 213983] WARNING: CPU: 3 PID: 520 at drivers/gpu/drm/ttm/ttm_bo.c:409 ttm_bo_release+0x7a/0x803 [ttm]
https://bugzilla.kernel.org/show_bug.cgi?id=213983 --- Comment #1 from Erhard F. (erhar...@mailbox.org) --- Created attachment 298223 --> https://bugzilla.kernel.org/attachment.cgi?id=298223&action=edit kernel .config (5.14-rc4, AMD A10-7860K)
[Bug 214001] New: [bisected][regression] After commit "drm/ttm: Initialize debugfs from ttm_global_init()" kernels without debugfs explicitly set to 'allow all' fail to boot
https://bugzilla.kernel.org/show_bug.cgi?id=214001 Bug ID: 214001 Summary: [bisected][regression] After commit "drm/ttm: Initialize debugfs from ttm_global_init()" kernels without debugfs explicitly set to 'allow all' fail to boot Product: Drivers Version: 2.5 Kernel Version: 5.14-rc4 Hardware: All OS: Linux Tree: Mainline Status: NEW Severity: normal Priority: P1 Component: Video(DRI - non Intel) Assignee: drivers_video-...@kernel-bugs.osdl.org Reporter: untaintablean...@hotmail.co.uk Regression: No So this is an interesting one. Problem: System hangs indefinitely/refuses to boot up. 5.14rc3 was totally fine but rc4 has the problem and I've bisected the commit to: 69de4421bb4c103ef42a32bafc596e23918c106f is the first bad commit commit 69de4421bb4c103ef42a32bafc596e23918c106f Author: Jason Ekstrand Date: Wed Jul 21 10:23:57 2021 -0500 drm/ttm: Initialize debugfs from ttm_global_init() We create a bunch of debugfs entries as a side-effect of ttm_global_init() and then never clean them up. This isn't usually a problem because we free the whole debugfs directory on module unload. However, if the global reference count ever goes to zero and then ttm_global_init() is called again, we'll re-create those debugfs entries and debugfs will complain in dmesg that we're creating entries that already exist. This patch fixes this problem by changing the lifetime of the whole TTM debugfs directory to match that of the TTM global state. Signed-off-by: Jason Ekstrand Reviewed-by: Daniel Vetter Signed-off-by: Daniel Vetter Link: https://patchwork.freedesktop.org/patch/msgid/20210721152358.2893314-6-ja...@jlekstrand.net I then tried loading an ubuntu mainline kernel for 5.14-rc4 and that was fine too, which meant my .config was to blame in conjunction with the change. 
The specific issue narrowed down to not having debugfs enabled in my kernel (CONFIG_DEBUG_FS is not set). I've not had debugfs enabled for many, many years (is it even necessary on a kernel whose user makes no use of the information it provides?), and now I see that CONFIG_DEBUG_FS=y allows for one of three mutually exclusive options: CONFIG_DEBUG_FS_ALLOW_ALL, CONFIG_DEBUG_FS_DISALLOW_MOUNT and CONFIG_DEBUG_FS_ALLOW_NONE. (Moving forward, is debugfs now a critical component of the Linux kernel, required to be enabled (CONFIG_DEBUG_FS=y) with at minimum the third option, 'allow none', given that so many things want to make use of it? Is debugfs 'expected' to be there for driver code to reference from now on?) At any rate, I tested each of the three options and can confirm that since the commit in question, the system will _only_ boot if CONFIG_DEBUG_FS_ALLOW_ALL=y is set. I suspect that the commit did not account for people who build kernels without debugfs at all. However, it even causes boot issues if debugfs is present but minimised, because neither CONFIG_DEBUG_FS_DISALLOW_MOUNT ("The API is open but filesystem is not loaded. Clients can still do their work and read with debug tools that do not need debugfs filesystem.") nor CONFIG_DEBUG_FS_ALLOW_NONE ("Access is off. Clients get -PERM when trying to create nodes in debugfs tree and debugfs is not registered as a filesystem. Client can then back-off or continue without debugfs access.") is sufficient to get a successful boot after this commit.
[Bug 214001] [bisected][regression] After commit "drm/ttm: Initialize debugfs from ttm_global_init()" kernels without debugfs explicitly set to 'allow all' fail to boot
https://bugzilla.kernel.org/show_bug.cgi?id=214001 Linux_Chemist (untaintablean...@hotmail.co.uk) changed: What|Removed |Added Regression|No |Yes
[Bug 214001] [bisected][regression] After commit "drm/ttm: Initialize debugfs from ttm_global_init()" kernels without debugfs explicitly set to 'allow all' fail to boot
https://bugzilla.kernel.org/show_bug.cgi?id=214001 --- Comment #1 from Linux_Chemist (untaintablean...@hotmail.co.uk) --- As an addendum, I suppose a slight source of confusion is the info for CONFIG_DEBUG_FS which reads: "debugfs is a virtual file system that kernel developers use to put debugging files into. Enable this option to be able to read and write to these files. For detailed documentation on the debugfs API, see Documentation/filesystems/. If unsure, say N." which implies: a) that it isn't strictly necessary to have enabled in order to boot/run normally (highlighting this bug) and b) that you would have zero need for it if you weren't reading/writing to these debugging files. To then have the option to enable debugfs but only run minimally with CONFIG_DEBUG_FS_ALLOW_NONE: "Access is off. Clients get -PERM when trying to create nodes in debugfs tree and debugfs is not registered as a filesystem. Client can then back-off or continue without debugfs access." leaves the question of 'why have it on and set to "allow none" rather than off completely?'
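Anyone trying to reproduce this report needs to know which of the debugfs configurations a given kernel was built with. The Python sketch below classifies a .config using only the option names quoted in this bug; the debugfs_mode helper and its return labels are illustrative, and the .config location is whatever your build or distribution provides.

```python
# Option names as quoted in the report; labels are informal descriptions.
ACCESS_OPTIONS = (
    ("CONFIG_DEBUG_FS_ALLOW_ALL", "allow all"),
    ("CONFIG_DEBUG_FS_DISALLOW_MOUNT", "disallow mount"),
    ("CONFIG_DEBUG_FS_ALLOW_NONE", "allow none"),
)


def debugfs_mode(config_text):
    """Classify the debugfs setup described by the text of a kernel .config."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        # "# CONFIG_FOO is not set" lines start with '#' and are ignored.
        if line.startswith("CONFIG_") and line.endswith("=y"):
            enabled.add(line[: -len("=y")])
    if "CONFIG_DEBUG_FS" not in enabled:
        return "disabled"
    for option, label in ACCESS_OPTIONS:
        if option in enabled:
            return label
    return "enabled (access mode not specified)"


if __name__ == "__main__":
    import sys

    # Usage: python3 this_script.py path/to/.config
    with open(sys.argv[1]) as config_file:
        print(debugfs_mode(config_file.read()))
```

According to the report, only the "allow all" result corresponds to a kernel that still boots after the commit in question.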
[Bug 214001] [bisected][regression] After commit "drm/ttm: Initialize debugfs from ttm_global_init()" kernels without debugfs explicitly set to 'allow all' fail to boot
https://bugzilla.kernel.org/show_bug.cgi?id=214001 Linux_Chemist (untaintablean...@hotmail.co.uk) changed: What|Removed |Added Component|Video(DRI - non Intel) |Video(Other)
[Bug 214029] New: [NAVI] Several memory leaks in amdgpu and ttm
https://bugzilla.kernel.org/show_bug.cgi?id=214029 Bug ID: 214029 Summary: [NAVI] Several memory leaks in amdgpu and ttm Product: Drivers Version: 2.5 Kernel Version: 5.14-rc5 Hardware: All OS: Linux Tree: Mainline Status: NEW Severity: normal Priority: P1 Component: Video(DRI - non Intel) Assignee: drivers_video-...@kernel-bugs.osdl.org Reporter: erhar...@mailbox.org CC: dan...@ffwll.ch Regression: No Created attachment 298265 --> https://bugzilla.kernel.org/attachment.cgi?id=298265&action=edit kernel dmesg (kernel 5.14-rc5, AMD FX-8370) Getting this on kernel 5.14-rc5 with my Radeon RX 5500. unreferenced object 0x888169af1b40 (size 216): comm "lightdm-gtk-gre", pid 662, jiffies 4294902381 (age 13444.937s) hex dump (first 32 bytes): d0 1b af 69 81 88 ff ff 60 cb b9 c0 ff ff ff ff ...i`... 80 73 48 e1 13 00 00 00 58 7d c1 0b 00 c9 ff ff .sH.X}.. backtrace: [] drm_sched_fence_create+0x1f/0x1d0 [gpu_sched] [] drm_sched_job_init+0x10e/0x240 [gpu_sched] [] amdgpu_job_submit+0x27/0x2d0 [amdgpu] [] amdgpu_copy_buffer+0x49e/0x700 [amdgpu] [] amdgpu_ttm_copy_mem_to_mem+0x5fa/0xf00 [amdgpu] [] amdgpu_bo_move+0x356/0x2180 [amdgpu] [] ttm_bo_handle_move_mem+0x1c7/0x620 [ttm] [] ttm_bo_validate+0x2c7/0x450 [ttm] [] amdgpu_bo_fault_reserve_notify+0x2a4/0x640 [amdgpu] [] amdgpu_gem_fault+0x123/0x2d0 [amdgpu] [] __do_fault+0xf3/0x3e0 [] __handle_mm_fault+0x1bcb/0x2ac0 [] handle_mm_fault+0x12a/0x490 [] do_user_addr_fault+0x259/0xb70 [] exc_page_fault+0x55/0xb0 [] asm_exc_page_fault+0x1b/0x20 unreferenced object 0x888263377700 (size 72): comm "sdma0", pid 345, jiffies 4294902381 (age 13444.937s) hex dump (first 32 bytes): f0 f3 5c 69 81 88 ff ff 80 8a cf c1 ff ff ff ff ..\i f2 a0 4c e1 13 00 00 00 58 28 9b c9 81 88 ff ff ..L.X(.. 
backtrace: [] amdgpu_fence_emit+0x91/0x790 [amdgpu] [] amdgpu_ib_schedule+0x8cb/0x12f0 [amdgpu] [] amdgpu_job_run+0x35e/0x790 [amdgpu] [] drm_sched_main+0x64e/0xc60 [gpu_sched] [] kthread+0x342/0x410 [] ret_from_fork+0x22/0x30 unreferenced object 0x88811314b9c0 (size 216): comm "mate-session-ch", pid 768, jiffies 4294905408 (age 13434.854s) hex dump (first 32 bytes): 50 ba 14 13 81 88 ff ff 60 cb b9 c0 ff ff ff ff P...`... dc 7a c1 3a 16 00 00 00 58 7d c1 0b 00 c9 ff ff .z.:X}.. backtrace: [] drm_sched_fence_create+0x1f/0x1d0 [gpu_sched] [] drm_sched_job_init+0x10e/0x240 [gpu_sched] [] amdgpu_job_submit+0x27/0x2d0 [amdgpu] [] amdgpu_copy_buffer+0x49e/0x700 [amdgpu] [] amdgpu_ttm_copy_mem_to_mem+0x5fa/0xf00 [amdgpu] [] amdgpu_bo_move+0x356/0x2180 [amdgpu] [] ttm_bo_handle_move_mem+0x1c7/0x620 [ttm] [] ttm_bo_validate+0x2c7/0x450 [ttm] [] amdgpu_bo_fault_reserve_notify+0x2a4/0x640 [amdgpu] [] amdgpu_gem_fault+0x123/0x2d0 [amdgpu] [] __do_fault+0xf3/0x3e0 [] __handle_mm_fault+0x1bcb/0x2ac0 [] handle_mm_fault+0x12a/0x490 [] do_user_addr_fault+0x259/0xb70 [] exc_page_fault+0x55/0xb0 [] asm_exc_page_fault+0x1b/0x20 unreferenced object 0x888167ffc340 (size 72): comm "sdma0", pid 345, jiffies 4294905408 (age 13434.854s) hex dump (first 32 bytes): f0 f3 5c 69 81 88 ff ff 80 8a cf c1 ff ff ff ff ..\i ac b5 c5 3a 16 00 00 00 58 e4 a7 01 81 88 ff ff ...:X... backtrace: [] amdgpu_fence_emit+0x91/0x790 [amdgpu] [] amdgpu_ib_schedule+0x8cb/0x12f0 [amdgpu] [] amdgpu_job_run+0x35e/0x790 [amdgpu] [] drm_sched_main+0x64e/0xc60 [gpu_sched] [] kthread+0x342/0x410 [] ret_from_fork+0x22/0x30 unreferenced object 0x888113b6d240 (size 216): comm "mate-screensave", pid 57770, jiffies 4295052030 (age 12946.214s) hex dump (first 32 bytes): d0 d2 b6 13 81 88 ff ff 60 cb b9 c0 ff ff ff ff `... a2 85 ff 05 88 00 00 00 58 7d c1 0b 00 c9 ff ff X}.. 
backtrace: [] drm_sched_fence_create+0x1f/0x1d0 [gpu_sched] [] drm_sched_job_init+0x10e/0x240 [gpu_sched] [] amdgpu_job_submit+0x27/0x2d0 [amdgpu] [] amdgpu_copy_buffer+0x49e/0x700 [amdgpu] [] amdgpu_ttm_copy_mem_to_mem+0x5fa/0xf00 [amdgpu] [] amdgpu_bo_move+0x356/0x2180 [amdgpu] [] ttm_bo_handle_move_mem+0x1c7/0x620 [ttm] [] ttm_bo_validate+0x2c7/0x450 [ttm] [] amdgpu_bo_fault_reserve_notify+0x2a4/0x640 [amdgpu] [] amdgpu_gem_fault+0x123/0x2d0 [amdgpu] [] __do_fault+0xf3/0x3e0 [] __handle_mm_fault+0x1bcb/0x2ac0 [] handle_mm_fault+0x12a/0x490 [] do_user_addr_fault+0x259/0xb70 [] exc_page_fault+0x55/0xb0 [] asm_exc_page_fault+0x1b/0x20 unreferenced object 0x8881c85d6e80 (size 72): comm "sdma0", pid 345, jiffies 4295052030 (age 12946.217s) hex dump (first 32 bytes): f0 f3
[Bug 214029] [NAVI] Several memory leaks in amdgpu and ttm
https://bugzilla.kernel.org/show_bug.cgi?id=214029 --- Comment #1 from Erhard F. (erhar...@mailbox.org) --- Created attachment 298267 --> https://bugzilla.kernel.org/attachment.cgi?id=298267&action=edit output of kmemleak (kernel 5.14-rc5, AMD FX-8370)
[Bug 214029] [NAVI] Several memory leaks in amdgpu and ttm
https://bugzilla.kernel.org/show_bug.cgi?id=214029 --- Comment #2 from Erhard F. (erhar...@mailbox.org) --- Created attachment 298269 --> https://bugzilla.kernel.org/attachment.cgi?id=298269&action=edit kernel .config (kernel 5.14-rc5, AMD FX-8370)
[Bug 213935] AMDGPU Renoir crash/freeze while using vaapi with some video types in some apps - drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for fences timed out!
https://bugzilla.kernel.org/show_bug.cgi?id=213935 Fabian (fabisc...@mailbox.org) changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |ANSWERED --- Comment #2 from Fabian (fabisc...@mailbox.org) --- You're right. I was able to test with a more up-to-date version (21.1.6) and it no longer happens. Sadly this doesn't fix my system freeze when using vaapi in Firefox, but that seems to be another bug. Thank you for your help :) This one is resolved.
[Bug 214001] [bisected][regression] After commit "drm/ttm: Initialize debugfs from ttm_global_init()" kernels without debugfs explicitly set to 'allow all' fail to boot
https://bugzilla.kernel.org/show_bug.cgi?id=214001 Duncan (1i5t5.dun...@cox.net) changed: What|Removed |Added CC||1i5t5.dun...@cox.net --- Comment #2 from Duncan (1i5t5.dun...@cox.net) --- This has also been reported (by someone else) on the dri-devel list, with the main kernel list and the devs CCed, and I confirmed it there. No answer from the devs there either. The reporter and I followed the reporting instructions to take it to the list, and there is no hint it was even seen, despite the release getting closer and closer. So I was going to try bugzilla (despite the instructions to take it to the list) to see if I could raise the profile a bit, and found this bug. Anyway, it's on both channels now. FWIW: https://lore.kernel.org/dri-devel/?q=5.14.0-rc4+broke+drm%2Fttm Though FWIW your symptoms are a bit different from the list reporter's and mine. We were able to boot, but only to legacy low-res VGA mode. He has a boot splash enabled, and his screen blanked from early boot, when the drm framebuffer would normally take over, until the login prompt, which appeared in VGA mode. I prefer to see the boot messages, so I have no splash and my screen didn't blank; it just never left the legacy VGA mode it normally uses for early boot. We're both on Radeons; he's on the old radeon driver while I'm on amdgpu (polaris-11, rx460). I wonder if you don't have the legacy vgacon (CONFIG_VGA_CONSOLE) enabled as a fallback, as that would explain an apparent hang due to no valid graphics (though the system may have booted, just without graphics). Alternatively, I don't know how drm framebuffer drivers other than radeon/amdgpu behave; maybe whatever you're running either does hang or simply doesn't fall back to vgacon as our radeon and amdgpu drivers did? But in both his case and mine it bisected to the same commit, 69de4421bb, and reverting it against current gave both of us working systems again, so it's the same bug.
[Bug 214001] [bisected][regression] After commit "drm/ttm: Initialize debugfs from ttm_global_init()" kernels without debugfs explicitly set to 'allow all' fail to boot
https://bugzilla.kernel.org/show_bug.cgi?id=214001 --- Comment #3 from Linux_Chemist (untaintablean...@hotmail.co.uk) --- Thanks for your comment, Duncan! Yes, I'm on a customised kernel that has a lot removed (including debugfs as you can tell) and also amdgpu (RX 5700). There's usually a bug in a testing RC every few releases, I just report them here after bisecting; seems the right place for it even if it's not lol Caught a nice bug last release cycle with the memory reservation for the bios (https://www.phoronix.com/scan.php?page=news_item&px=Linux-Always-Reserve-1MB) (I wasn't sure to file this one under an AMD ("non-intel") specific 'video' bug but the commit was for 'drivers/gpu/drm/ttm' which I assume is agnostic. I don't know what it's for or whether only amdgpu/radeon makes use of it to say but it is interesting that the 3 of us have similar hardware.) I can confirm all my .configs have had CONFIG_VGA_CONSOLE=y in it (though a lot of fallback stuff pulled out that probably stops me getting the legacy low-res VGA mode you mention, c'est la vie) But anyways as you say, the ability to create a bootable kernel only becomes an issue from the commit in question when not having CONFIG_DEBUG_FS=y (and CONFIG_DEBUG_FS_ALLOW_ALL=y along with that) Don't get me wrong, it's not a showstopper 'massive bug' because you can always put debugfs + 'allow all' into your kernel, I did so and am happily on rc5 now, but that's why I'd like a consensus to be known or shared (i.e. change the wording for the kconfig options) about whether a lot of things are expecting debugfs to be there in some form now - is it now an 'essential' part of the kernel? Or should things that rely on it fail gracefully if they don't find it? Either it's essential and this isn't a bug and there needs to be clarification that debugfs should always be there in some form, or this is a bug and the commit needs tweaked to account for debugfs not being there or there in a diminished capacity. 
It is a bit silly that even CONFIG_DEBUG_FS_ALLOW_NONE wouldn't work for this bug because that seems like it should be providing that 'fail gracefully' mechanism to debugfs being 'there' but 'don't bother with it'.
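Editor's sketch, not part of the original thread: the condition described in comment #3 can be checked against a kernel .config before building. The helper below is a minimal illustration in Python. The function names are hypothetical, and CONFIG_DEBUG_FS_DISALLOW_MOUNT is assumed to be the third of the access choices mentioned in the thread (the other option names come from the comment itself). The rule "anything other than debugfs plus 'allow all' is affected" is taken from this report, not from the kernel sources.

```python
import re

# Option names from the discussion; DISALLOW_MOUNT is assumed to be the
# third access choice offered once CONFIG_DEBUG_FS is enabled.
DEBUGFS_OPTIONS = (
    "CONFIG_DEBUG_FS",
    "CONFIG_DEBUG_FS_ALLOW_ALL",
    "CONFIG_DEBUG_FS_DISALLOW_MOUNT",
    "CONFIG_DEBUG_FS_ALLOW_NONE",
)


def debugfs_settings(config_text):
    """Return {option: bool} for the debugfs options in a .config text."""
    # Lines such as "# CONFIG_DEBUG_FS is not set" do not match and count as off.
    enabled = set(re.findall(r"^(CONFIG_\w+)=y$", config_text, flags=re.M))
    return {opt: opt in enabled for opt in DEBUGFS_OPTIONS}


def affected_by_regression(config_text):
    """True if, per this bug report, the config would hit the regression."""
    s = debugfs_settings(config_text)
    return not (s["CONFIG_DEBUG_FS"] and s["CONFIG_DEBUG_FS_ALLOW_ALL"])


if __name__ == "__main__":
    print(affected_by_regression("# CONFIG_DEBUG_FS is not set\n"))
```

To try it on a real tree, pass the contents of the .config file, for example `affected_by_regression(open(".config").read())`.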
[Bug 214001] [bisected][regression] After commit "drm/ttm: Initialize debugfs from ttm_global_init()" kernels without debugfs explicitly set to 'allow all' fail to boot
https://bugzilla.kernel.org/show_bug.cgi?id=214001 --- Comment #4 from Duncan (1i5t5.dun...@cox.net) --- (In reply to Linux_Chemist from comment #3) > Thanks for your comment, Duncan! > Yes, I'm on a customised kernel that has a lot removed (including debugfs as > you can tell) and also amdgpu (RX 5700). So amdgpu too, but a lot newer than my rx460. > There's usually a bug in a testing RC every few releases, I just report them > here after bisecting; seems the right place for it even if it's not lol That's what I had done previously. But it looks like the kernel bugzilla folks updated the kernel-specific instructions recently, and I decided to try to follow them. Not that it seems to have done a lot of good. At least in bugzilla it's easier for other users (not on whatever list) to see it, which makes a difference if it's something that can't be fixed immediately, due either to no response (as here) or to complications like bisect difficulty or inability to directly revert due to commit dependencies, both of which complicated the last amdgpu bug I filed (against 5.7, bug #207383). > (https://www.phoronix.com/scan.php?page=news_item&px=Linux-Always-Reserve- > 1MB) I remember reading about that and agreeing with the 1MiB reserved idea in general, tho the bug triggering the proposal didn't directly affect me. > > (I wasn't sure to file this one under an AMD ("non-intel") specific 'video' > bug but the commit was for 'drivers/gpu/drm/ttm' which I assume is agnostic. > I don't know what it's for or whether only amdgpu/radeon makes use of it to > say but it is interesting that the 3 of us have similar hardware.) All three of us users seemed to consider it generic drm, as I know ttm in general is, but you're correct, the fact that we're all on some form of radeon looks like it could be more than coincidence. The on-list reporter and I, was interesting, but now it's three, and getting more than interesting. 
FWIW the #207383 bug I mentioned bisected to a commit in Andrew Morton's mm tree, but only affected amdgpu (and radeon? IDR) because of something only amdgpu was doing. I filed that against amdgpu but only because I hadn't bisected yet when I first filed it, and had no clue it was going to bisect to an mm tree commit. And *OH* *MAN* was that thing hard to bisect, in part because I could count on my bisect-bad results but could never be entirely sure about bisect-good. By the time I finally finished the bisect, a number of others were CCed as they were seeing the bug too, some reproducing a lot more reliably than I was, but no one was really helping with the bisecting! OTOH, all the others /were/ an encouragement to keep going, a good thing really, because as hard as that thing was to bisect, if it hadn't been for them I'd have been *seriously* tempted to simply buy a new graphics card and be done with it! I seriously hope I never have any bugs so difficult to bisect ever again! > I can confirm all my .configs have had CONFIG_VGA_CONSOLE=y in it Hmm. I really suspected that was the problem. I guess not. And amdgpu as well. New theory would be that the behavior's different on your much newer hardware. Not that it's likely to help with the bug but I'd be interested in whether it works with /only/ vgacon, no drm configured at all. Of course that'd be a CLI-only test, booting to text login, but if you've never tried it on that hardware it might be useful to know if vgacon works at all for you. (Similarly here, the last time I had really tested vgacon was before the switch from ums to kms. 
I knew it was /supposed/ to be a fallback, but for the last year or so the question had been nagging at me as to whether the fallback would actually work if the graphics card failed and I had to switch, since I'm doing a monolithic kernel with only the specific firmware for this card builtin, so if the vgacon fallback didn't work I'd be in trouble as in that case I couldn't get a text login to configure the new drivers and firmware and rebuild. So this failure was actually a relief for me, as it demonstrated that the fallback /does/ still work, should I ever need it.) > But anyways as you say, the ability to create a bootable kernel only becomes > an issue from the commit in question when not having CONFIG_DEBUG_FS=y (and > CONFIG_DEBUG_FS_ALLOW_ALL=y along with that) I think you said that. I was at least getting the fallback, and I guess like you before this bug, I didn't even have a clue that the three secondary choices once CONFIG_DEBUG_FS was enabled were there. Reading it here was new to me! =:^) > Don't get me wrong, it's not a showstopper 'massive bug' because you can > always put debugfs + 'allow all' into your kernel, I did so and am happily > on rc5 now I, and I think the guy who reported it to the list, reverted the commit in question, instead. Here, I do that by doing a git show --reverse redirected to a patch-file, then drop that patch-file in a particular directory where it
[Bug 213391] AMDGPU retries page fault with some specific processes amdgpu and sometimes followed [gfxhub0] retry page fault until *ERROR* ring gfx timeout, but soft recovered
https://bugzilla.kernel.org/show_bug.cgi?id=213391 mcmar...@gmx.net changed: What|Removed |Added CC||mcmar...@gmx.net --- Comment #35 from mcmar...@gmx.net --- I have a Lenovo L340 and the same problem. Here is the complete dmesg log: https://gist.github.com/McMarius11/36c8d21a2dcaf5c2289c91a74af4f7fb
Operating System: Manjaro Linux
KDE Plasma Version: 5.22.4
KDE Frameworks Version: 5.84.0
Qt Version: 5.15.2
Kernel Version: 5.11.22-2-MANJARO (64-bit)
Graphics Platform: X11
Processors: 8 × AMD Ryzen 7 3700U with Radeon Vega Mobile Gfx
Memory: 5,6 GiB of RAM
Graphics Processor: AMD Radeon™ Vega 10 Graphics
[Bug 214071] New: amdgpu idle power draw to high at +75Hz
https://bugzilla.kernel.org/show_bug.cgi?id=214071 Bug ID: 214071 Summary: amdgpu idle power draw to high at +75Hz Product: Drivers Version: 2.5 Kernel Version: 5.13.10 Hardware: x86-64 OS: Linux Tree: Mainline Status: NEW Severity: normal Priority: P1 Component: Video(DRI - non Intel) Assignee: drivers_video-...@kernel-bugs.osdl.org Reporter: p...@gmx.de Regression: No For best viewing pleasure I usually set my monitor to 144Hz and native 1080p. At that refresh rate my RX6900XT draws about 35 Watts when idle, and the memory clock stays at 1000MHz. I have to lower the monitor's refresh rate to 75Hz; then the card draws only 8 Watts when idle and the memory clock drops significantly, to 96MHz. Using 100 or 120 Hz does not help. The situation in Windows is different: the same hardware setup works in Windows 10 at 1080p@144 with an idle power draw of just 8 Watts. So my guess is that this is a driver issue and not a hardware issue.
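Editor's sketch, not part of the original report: the memory clock levels mentioned above are usually readable through amdgpu's pp_dpm_mclk sysfs file. The Python helper below parses text in the layout that file is assumed to use ("index: NNNMhz", with a trailing "*" on the active level). The sysfs path (in particular the card0 index), the function names, and the sample text are assumptions for illustration; only the 96 and 1000 MHz values come from the report.

```python
import re

# Assumed location; the card index differs between systems.
MCLK_PATH = "/sys/class/drm/card0/device/pp_dpm_mclk"


def parse_dpm_levels(text):
    """Parse pp_dpm_* style text into ({index: MHz}, active_index or None)."""
    levels = {}
    active = None
    for line in text.splitlines():
        m = re.match(r"\s*(\d+):\s*(\d+)\s*MHz\s*(\*)?", line, flags=re.I)
        if not m:
            continue  # skip anything that is not a level line
        idx, mhz = int(m.group(1)), int(m.group(2))
        levels[idx] = mhz
        if m.group(3):
            active = idx  # the "*" marks the currently selected level
    return levels, active


def read_mclk(path=MCLK_PATH):
    """Read and parse the memory clock levels from sysfs (not called here)."""
    with open(path) as f:
        return parse_dpm_levels(f.read())


if __name__ == "__main__":
    # Made-up sample using the two clock values quoted in the report.
    sample = "0: 96Mhz\n1: 1000Mhz *\n"
    levels, active = parse_dpm_levels(sample)
    print(levels, active)
```

Comparing the active level at 75Hz and at 144Hz would show whether the memory clock is pinned at its highest state, which is what the report describes.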
[Bug 214071] amdgpu idle power draw too high at +75Hz
https://bugzilla.kernel.org/show_bug.cgi?id=214071 Paul Größel (p...@gmx.de) changed:
What    |Removed                                  |Added
Summary |amdgpu idle power draw to high at +75Hz  |amdgpu idle power draw too high at +75Hz
[Bug 214071] amdgpu idle power draw too high at +75Hz
https://bugzilla.kernel.org/show_bug.cgi?id=214071 --- Comment #1 from Paul Größel (p...@gmx.de) --- Hardware setup:
Mainboard: MSI MPG B550I GAMING EDGE WIFI
CPU: Ryzen 5950X
GPU: Radeon RX 6900XT
Kernel 5.13.10
mesa 21.1.4
X11
[Bug 205089] amdgpu : drm:amdgpu_cs_ioctl : Failed to initialize parser -125
https://bugzilla.kernel.org/show_bug.cgi?id=205089 ctjans...@protonmail.com changed: What|Removed |Added CC||ctjans...@protonmail.com --- Comment #19 from ctjans...@protonmail.com --- I just triggered this bug as well, playing Payday 2. I also triggered it when playing World of Warcraft in June.
OS: EndeavourOS Linux x86_64
Kernel: 5.13.10-arch1-1
Mesa: 21.1.6
DE: GNOME 40.3
CPU: Ryzen 9 5900X
GPU: RX 6800 XT
[Bug 211425] [drm:atom_op_jump] *ERROR* atombios stuck in loop for more than 20secs aborting
https://bugzilla.kernel.org/show_bug.cgi?id=211425 Andreas (icedragon...@web.de) changed: What|Removed |Added Kernel Version|5.13.6 |5.13.11 --- Comment #19 from Andreas (icedragon...@web.de) --- Still broken. I have updated the kernel version in the bug tracker header to the latest released kernel.
[Bug 204849] amdgpu (RX560X) traceboot in dmesg boot output, system instability
https://bugzilla.kernel.org/show_bug.cgi?id=204849 Justin Clift (jus...@postgresql.org) changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |OBSOLETE --- Comment #4 from Justin Clift (jus...@postgresql.org) --- Clearly no-one is ever going to look at this, so I'm just going to close it.
[Bug 201957] amdgpu: ring gfx timeout
https://bugzilla.kernel.org/show_bug.cgi?id=201957 --- Comment #48 from i-am-not-a-ro...@riseup.net --- This seems to be a firmware(-related) problem. After downgrading to linux firmware 2020-09-18, I've been running for 6 days without a crash on the same workloads. (I was getting multiple crashes per day before.) My GPU is Vega8 Mobile (ThinkPad A485). Currently running 5.13.11. An extensive discussion of different firmware versions in the context of a similar issue on Arch Forums: https://bbs.archlinux.org/viewtopic.php?id=266358&p=5
[Bug 214001] [bisected][regression] After commit "drm/ttm: Initialize debugfs from ttm_global_init()" kernels without debugfs explicitly set to 'allow all' fail to boot
https://bugzilla.kernel.org/show_bug.cgi?id=214001 --- Comment #5 from Linux_Chemist (untaintablean...@hotmail.co.uk) --- I can sense you're a smart cookie, Duncan, I've enjoyed this little tete a tete. I think this bug has been addressed, it's just not been mentioned here yet (see the following commit, now in mainline): https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/gpu?id=958f44255058338f4b370d8e4100e1e7d72db0cc "This changes it so that if creation of TTM's debugfs root directory fails, then no biggie: keep calm and carry on." Will test it out as soon as I can and comment/adjust the bug report.
[Bug 214001] [bisected][regression] After commit "drm/ttm: Initialize debugfs from ttm_global_init()" kernels without debugfs explicitly set to 'allow all' fail to boot
https://bugzilla.kernel.org/show_bug.cgi?id=214001 Linux_Chemist (untaintablean...@hotmail.co.uk) changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |CODE_FIX --- Comment #6 from Linux_Chemist (untaintablean...@hotmail.co.uk) --- All good! CONFIG_DEBUG_FS is not set and can boot again :) Marking as fixed.
[Bug 208909] amdgpu Ryzen 7 4700U NULL pointer dereference multi monitor with rotation
https://bugzilla.kernel.org/show_bug.cgi?id=208909 ker...@890.at changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |CODE_FIX --- Comment #13 from ker...@890.at --- I have now upgraded to 5.11.0-27-generic (hwe) and this problem seems to be fixed.
[Bug 211277] sometimes crash at s2ram-wake (Ryzen 3500U): amdgpu, drm, commit_tail, amdgpu_dm_atomic_commit_tail
https://bugzilla.kernel.org/show_bug.cgi?id=211277 --- Comment #37 from James Zhu (jam...@amd.com) --- Hi Jerome and kolAflash, would you mind adding pci=noats to the boot parameters, keeping the rest of your original test configuration unchanged? For example:
linux /boot/vmlinuz-5.4.0-54-generic root=UUID=803844cc-7291-4056-bd04-f1b43b54ed97 ro pci=noats
Please see if this helps. Thanks! James
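Editor's sketch, not part of the original thread: after rebooting with the suggested parameter, it is worth confirming that it actually reached the kernel. The Python helper below checks a kernel command line (as found in /proc/cmdline) for pci=noats. The function names are hypothetical, and the parsing is deliberately naive: it splits on whitespace, does not handle quoted values, and keeps only the last occurrence of a repeated parameter.

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into {name: value or None}."""
    params = {}
    for token in cmdline.split():
        # Split on the first "=" only, so root=UUID=... keeps its full value.
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


def has_pci_noats(cmdline):
    """True if the pci= option list on the command line contains noats."""
    value = parse_cmdline(cmdline).get("pci")
    # pci= takes a comma-separated list of options.
    return value is not None and "noats" in value.split(",")


if __name__ == "__main__":
    example = "root=UUID=803844cc-7291-4056-bd04-f1b43b54ed97 ro pci=noats"
    print(has_pci_noats(example))
```

On a running system the input would be `open("/proc/cmdline").read()`.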
[Bug 205089] amdgpu : drm:amdgpu_cs_ioctl : Failed to initialize parser -125
https://bugzilla.kernel.org/show_bug.cgi?id=205089 --- Comment #20 from Alois Nespor (i...@aloisnespor.info) --- (In reply to Alois Nespor from comment #15)
> i can confirm, have same problem now with Ryzen 5 3400G (RX Vega 11).
>
> kernel 5.13.4 and mesa 21.1.5
This seems to be fixed for me with linux-firmware 20210818.c46b8c3; see https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/commit/?id=d7b50e61669dc137924337d03d09b8986eb752a3 They reverted some firmware due to stability issues.
[Bug 211277] sometimes crash at s2ram-wake (Ryzen 3500U): amdgpu, drm, commit_tail, amdgpu_dm_atomic_commit_tail
https://bugzilla.kernel.org/show_bug.cgi?id=211277 --- Comment #38 from Jerome C (m...@jeromec.com) --- Hi James, With "pci=noats" set, suspend and resume work fine. I did see some errors (something about a device not being added) in the kernel log from "kfd", but I guess that's related to PCIe ATS being disabled by the kernel parameter. Thanks, Jerome
On 21/02/2021 00:17, bugzilla-dae...@bugzilla.kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=211277
>
> --- Comment #9 from kolAflash (kolafl...@kolahilft.de) ---
> I'm on Linux-5.7 now since 2021-01-26.
> And I woke up the notebook at least once a day since then.
> So it's clearly a regression in the kernel somewhere between 5.7 and 5.10 and probably between 5.7 and 5.8.
>
> And it's definitely not a BIOS issue, because I changed anything about the BIOS since the problem appeared last time with Kernel-5.10.
>
> Regards,
> kolAflash
[Bug 211277] sometimes crash at s2ram-wake (Ryzen 3500U): amdgpu, drm, commit_tail, amdgpu_dm_atomic_commit_tail
https://bugzilla.kernel.org/show_bug.cgi?id=211277 --- Comment #40 from James Zhu (jam...@amd.com) --- Hi Jerome, Yes, you are right. Turning off ATS will affect the IOMMU. KFD needs the IOMMU enabled. KFD supports the compute engine. This won't affect 3D and video acceleration. Once I confirm whether ATS/IOMMU causes the issue, I will find the right person to fix it. Thanks! James
[Bug 214197] New: [Asus G713QY] RX6800M not usable after exiting Vulkan application
https://bugzilla.kernel.org/show_bug.cgi?id=214197 Bug ID: 214197 Summary: [Asus G713QY] RX6800M not usable after exiting Vulkan application Product: Drivers Version: 2.5 Kernel Version: 5.13.13 Hardware: x86-64 OS: Linux Tree: Mainline Status: NEW Severity: normal Priority: P1 Component: Video(DRI - non Intel) Assignee: drivers_video-...@kernel-bugs.osdl.org Reporter: vele...@gmail.com Regression: No The Asus ROG Strix G17 Advantage Edition (G713QY) has hybrid graphics with an RX6800M dGPU. After exiting any Vulkan application, the dGPU becomes unusable. vulkaninfo sees the dGPU before the Vulkan app runs and does not see the RX6800M afterwards. After the Vulkan app closes, dmesg reports:
[ 154.385749] amdgpu :03:00.0: amdgpu: RAS: optional ras ta ucode is not available
[ 154.401405] amdgpu :03:00.0: amdgpu: SECUREDISPLAY: securedisplay ta ucode is not available
[ 154.401409] amdgpu :03:00.0: amdgpu: SMU is resuming...
[ 159.038150] amdgpu :03:00.0: amdgpu: message:RunDcBtc (54) param: 0x is timeout (no response)
[ 159.038154] amdgpu :03:00.0: amdgpu: Failed to setup smc hw!
[ 159.038156] [drm:amdgpu_device_ip_resume_phase2 [amdgpu]] *ERROR* resume of IP block failed -62
[ 159.038220] amdgpu :03:00.0: amdgpu: amdgpu_device_ip_resume failed (-62).
Using the amdgpu.runpm=0 parameter fixes the issue.
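Editor's sketch, not part of the original report: the trailing numbers in kernel messages like these are negated Linux errno values, so -62 is ETIME ("Timer expired"), which fits the SMU message timing out. The Python helper below pulls such trailing codes out of log lines and names them. The function name and the small lookup table are illustrative assumptions; the table lists only the two codes that appear in this archive (-62 here, -125 in bug 205089) with their values on Linux.

```python
import re

# Linux errno values for the codes seen in this archive (asm-generic numbering).
LINUX_ERRNO = {
    62: "ETIME",       # Timer expired
    125: "ECANCELED",  # Operation canceled
}


def trailing_errnos(log_text):
    """Return errno names for lines ending in -N, (-N) or (-N)."""
    names = []
    for line in log_text.splitlines():
        m = re.search(r"-(\d+)\)?\.?$", line.strip())
        if m:
            # Codes missing from the table are reported by number only.
            names.append(LINUX_ERRNO.get(int(m.group(1)), "errno " + m.group(1)))
    return names


if __name__ == "__main__":
    log = (
        "*ERROR* resume of IP block failed -62\n"
        "amdgpu: amdgpu_device_ip_resume failed (-62).\n"
    )
    print(trailing_errnos(log))
```

A lookup table is used instead of Python's errno module because errno numbering is platform specific, and the log may be read on a different system from the one that produced it.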
[Bug 214199] New: Sapphire NITRO+ RX 580 4G G5 - Secondary display doesn't wake up on boot, both displays won't wake up from suspend
https://bugzilla.kernel.org/show_bug.cgi?id=214199 Bug ID: 214199 Summary: Sapphire NITRO+ RX 580 4G G5 - Secondary display doesn't wake up on boot, both displays won't wake up from suspend Product: Drivers Version: 2.5 Kernel Version: 5.13.12 Hardware: All OS: Linux Tree: Mainline Status: NEW Severity: normal Priority: P1 Component: Video(DRI - non Intel) Assignee: drivers_video-...@kernel-bugs.osdl.org Reporter: ker...@zeljko.anonaddy.com Regression: No Created attachment 298497 --> https://bugzilla.kernel.org/attachment.cgi?id=298497&action=edit Boot dmesg Hi! This is the first time I've tried Linux with this graphics card. The issues are that my second monitor doesn't wake up from standby on boot, and that both monitors won't wake up from suspend. If I turn the second monitor off and back on after boot, it works fine. The same goes for both monitors after the computer wakes up from suspend: I have to turn both of them off and back on. I'm attaching both boot and suspend dmesg logs, taken with the drm.debug=0xe kernel parameter on a fresh install of Arch Linux.
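Editor's sketch, not part of the original report: drm.debug is a bitmask, so 0xe enables three debug categories at once. The Python snippet below decodes such a mask. The bit names follow the DRM debug categories as the editor recalls them from the kernel's drm_print.h and should be checked against the kernel version in use; the function name is hypothetical.

```python
# DRM debug category bits (assumed from include/drm/drm_print.h).
DRM_DEBUG_BITS = {
    0x01: "CORE",
    0x02: "DRIVER",
    0x04: "KMS",
    0x08: "PRIME",
    0x10: "ATOMIC",
    0x20: "VBL",
    0x40: "STATE",
    0x80: "LEASE",
    0x100: "DP",
}


def decode_drm_debug(mask):
    """Return the names of the debug categories enabled by a drm.debug mask."""
    return [name for bit, name in DRM_DEBUG_BITS.items() if mask & bit]


if __name__ == "__main__":
    print(decode_drm_debug(0xE))
```

Under these assumptions, 0xe covers driver, KMS and PRIME messages but leaves out the noisier core messages.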
[Bug 214199] Sapphire NITRO+ RX 580 4G G5 - Secondary display doesn't wake up on boot, both displays won't wake up from suspend
https://bugzilla.kernel.org/show_bug.cgi?id=214199 --- Comment #1 from Zeljko (ker...@zeljko.anonaddy.com) --- Created attachment 298499 --> https://bugzilla.kernel.org/attachment.cgi?id=298499&action=edit Suspend dmesg
[Bug 210321] /display/dc/dcn20/dcn20_resource.c:3240 dcn20_validate_bandwidth_fp+0x8b/0xd0 [amdgpu]
https://bugzilla.kernel.org/show_bug.cgi?id=210321 --- Comment #5 from Tristen Hayfield (tristen.hayfi...@gmail.com) --- I did some more digging into this. I put some logging inside the if block to see if that branch is ever taken:
    if (voltage_supported && dummy_pstate_supported) {
        context->bw_ctx.bw.dcn.clk.p_state_change_support = false;
        goto restore_dml_state;
    }
in order to log when or if the fallback worked. The logs confirmed that the fallback is often used and generally works. Upon starting up the system and starting up Xorg I get about a dozen log messages indicating that it entered the if block. The only exception seems to be as Florian describes above, that when the display shuts off due to power-saving it triggers the assertion.
___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
[Bug 212255] New: WARNING: at arch/x86/kernel/fpu/core.c:129 kernel_fpu_begin_mask
https://bugzilla.kernel.org/show_bug.cgi?id=212255 Bug ID: 212255 Summary: WARNING: at arch/x86/kernel/fpu/core.c:129 kernel_fpu_begin_mask Product: Drivers Version: 2.5 Kernel Version: 5.11.3 Hardware: x86-64 OS: Linux Tree: Mainline Status: NEW Severity: normal Priority: P1 Component: Video(DRI - non Intel) Assignee: drivers_video-...@kernel-bugs.osdl.org Reporter: i...@felicetufo.com Regression: No Created attachment 295819 --> https://bugzilla.kernel.org/attachment.cgi?id=295819&action=edit dmesg Hello, I get two warnings when booting kernel 5.11.3. The warnings are also present on later 5.11.x kernels (including 5.11.6), but do not show up on 5.11.2.
[ cut here ]
[6.099356] WARNING: CPU: 6 PID: 366 at arch/x86/kernel/fpu/core.c:129 kernel_fpu_begin_mask+0xa3/0xb0
---[ end trace 4cbc711ff0b0578b ]---
[6.102354] [ cut here ]
[6.102354] WARNING: CPU: 6 PID: 366 at arch/x86/kernel/fpu/core.c:155 kernel_fpu_end+0x1e/0x30
[Bug 212255] WARNING: at arch/x86/kernel/fpu/core.c:129 kernel_fpu_begin_mask
https://bugzilla.kernel.org/show_bug.cgi?id=212255 Alex Deucher (alexdeuc...@gmail.com) changed: What|Removed |Added CC||alexdeuc...@gmail.com --- Comment #1 from Alex Deucher (alexdeuc...@gmail.com) --- Possibly fixed with these patches? https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=15e8b95d5f7509e0b09289be8c422c459c9f0412 https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=680174cfd1e1cea70a8f30ccb44d8fbdf996018e
[Bug 212255] WARNING: at arch/x86/kernel/fpu/core.c:129 kernel_fpu_begin_mask
https://bugzilla.kernel.org/show_bug.cgi?id=212255 --- Comment #2 from Felice Tufo (i...@felicetufo.com) --- Thanks Alex, it seems that Linus merged those patches (just today) for the next -rc release, am I right? If so, I'll do a quick test and let you know as soon as the Ubuntu team releases the next mainline kernel (usually just 1 or 2 days after the -rc is out).