https://bugzilla.kernel.org/show_bug.cgi?id=77001
Bug ID: 77001
Summary: Radeon R9 270X GPU lockup and resume failure after all night inactivity
Product: Drivers
Version: 2.5
Kernel Version: 3.14.4
Hardware: x86-64
OS: Linux
Tree: Mainline
Status: NEW
Severity: normal
Priority: P1
Component: Video(DRI - non Intel)
Assignee: drivers_video-dri at kernel-bugs.osdl.org
Reporter: custos.mentis at gmail.com
Regression: No

Created attachment 137581
  --> https://bugzilla.kernel.org/attachment.cgi?id=137581&action=edit
kernel log with the lockup and following boot messages

After I left the computer on overnight, it hung in the morning when I tried to use it again, with the following messages:

------
May 28 06:13:06 [kernel] [153149.666146] radeon 0000:01:00.0: GPU lockup CP stall for more than 10033msec
May 28 06:13:06 [kernel] [153149.666150] radeon 0000:01:00.0: GPU lockup (waiting for 0x000000000071debe last fence id 0x000000000071debd on ring 0)
May 28 06:13:06 [kernel] [153150.122657] radeon 0000:01:00.0: GPU lockup CP stall for more than 10490msec
May 28 06:13:06 [kernel] [153150.122661] radeon 0000:01:00.0: GPU lockup (waiting for 0x000000000071debe last fence id 0x000000000071debd on ring 0)
May 28 06:13:06 [kernel] [153150.122664] radeon 0000:01:00.0: failed to get a new IB (-35)
May 28 06:13:06 [kernel] [153150.124014] radeon 0000:01:00.0: sa_manager is not empty, clearing anyway
May 28 06:13:07 [kernel] [153150.927575] radeon 0000:01:00.0: Saved 3296 dwords of commands on ring 0.
May 28 06:13:07 [kernel] [153150.927713] radeon 0000:01:00.0: GPU softreset: 0x0000004D
May 28 06:13:07 [kernel] [153150.927721] radeon 0000:01:00.0: GRBM_STATUS = 0xA3503028
May 28 06:13:07 [kernel] [153150.927726] radeon 0000:01:00.0: GRBM_STATUS_SE0 = 0x28000006
May 28 06:13:07 [kernel] [153150.927732] radeon 0000:01:00.0: GRBM_STATUS_SE1 = 0x2D000006
May 28 06:13:07 [kernel] [153150.927736] radeon 0000:01:00.0: SRBM_STATUS = 0x20000EC0
May 28 06:13:07 [kernel] [153150.927850] radeon 0000:01:00.0: SRBM_STATUS2 = 0x00000000
May 28 06:13:07 [kernel] [153150.927854] radeon 0000:01:00.0: R_008674_CP_STALLED_STAT1 = 0x00000000
May 28 06:13:07 [kernel] [153150.927859] radeon 0000:01:00.0: R_008678_CP_STALLED_STAT2 = 0x00004100
May 28 06:13:07 [kernel] [153150.927863] radeon 0000:01:00.0: R_00867C_CP_BUSY_STAT = 0x00028986
May 28 06:13:07 [kernel] [153150.927867] radeon 0000:01:00.0: R_008680_CP_STAT = 0x800282E7
May 28 06:13:07 [kernel] [153150.927874] radeon 0000:01:00.0: R_00D034_DMA_STATUS_REG = 0x44483146
May 28 06:13:07 [kernel] [153150.927878] radeon 0000:01:00.0: R_00D834_DMA_STATUS_REG = 0x44C83D57
May 28 06:13:07 [kernel] [153150.927883] radeon 0000:01:00.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x00000000
May 28 06:13:07 [kernel] [153150.927887] radeon 0000:01:00.0: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x00000000
May 28 06:13:08 [kernel] [153152.248694] radeon 0000:01:00.0: Wait for MC idle timedout !
May 28 06:13:08 [kernel] [153152.248703] radeon 0000:01:00.0: GRBM_SOFT_RESET=0x0000DDFF
May 28 06:13:08 [kernel] [153152.248760] radeon 0000:01:00.0: SRBM_SOFT_RESET=0x00100100
May 28 06:13:08 [kernel] [153152.249931] radeon 0000:01:00.0: GRBM_STATUS = 0x00003028
May 28 06:13:08 [kernel] [153152.249937] radeon 0000:01:00.0: GRBM_STATUS_SE0 = 0x00000006
May 28 06:13:08 [kernel] [153152.249941] radeon 0000:01:00.0: GRBM_STATUS_SE1 = 0x00000006
May 28 06:13:08 [kernel] [153152.249945] radeon 0000:01:00.0: SRBM_STATUS = 0x20000EC0
May 28 06:13:08 [kernel] [153152.250059] radeon 0000:01:00.0: SRBM_STATUS2 = 0x00000000
May 28 06:13:08 [kernel] [153152.250088] radeon 0000:01:00.0: R_008674_CP_STALLED_STAT1 = 0x00000000
May 28 06:13:08 [kernel] [153152.250092] radeon 0000:01:00.0: R_008678_CP_STALLED_STAT2 = 0x00000000
May 28 06:13:08 [kernel] [153152.250099] radeon 0000:01:00.0: R_00867C_CP_BUSY_STAT = 0x00000000
May 28 06:13:08 [kernel] [153152.250109] radeon 0000:01:00.0: R_008680_CP_STAT = 0x00000000
May 28 06:13:08 [kernel] [153152.250114] radeon 0000:01:00.0: R_00D034_DMA_STATUS_REG = 0x44C83D57
May 28 06:13:08 [kernel] [153152.250118] radeon 0000:01:00.0: R_00D834_DMA_STATUS_REG = 0x44C83D57
May 28 06:13:08 [kernel] [153152.250372] radeon 0000:01:00.0: GPU reset succeeded, trying to resume
May 28 06:13:13 [kernel] [153157.261944] [drm:atom_op_jump] *ERROR* atombios stuck in loop for more than 5secs aborting
May 28 06:13:13 [kernel] [153157.261952] [drm:atom_execute_table_locked] *ERROR* atombios stuck executing C008 (len 254, WS 0, PS 4) @ 0xC032
May 28 06:13:13 [kernel] [153157.261957] [drm:atom_execute_table_locked] *ERROR* atombios stuck executing B67A (len 94, WS 12, PS 8) @ 0xB6C3
May 28 06:13:13 [kernel] [153157.276909] [drm] probing gen 2 caps for device 1002:5a16 = 33ed02/0
May 28 06:13:13 [kernel] [153157.276918] [drm] PCIE gen 2 link speeds already enabled
May 28 06:13:14 [kernel] [153157.761994] radeon 0000:01:00.0: Wait for MC idle timedout !
May 28 06:13:14 [kernel] [153157.982771] radeon 0000:01:00.0: Wait for MC idle timedout !
May 28 06:13:14 [kernel] [153157.988649] [drm] PCIE GART of 1024M enabled (table at 0x0000000000276000).
May 28 06:13:14 [kernel] [153157.988769] radeon 0000:01:00.0: WB enabled
May 28 06:13:14 [kernel] [153157.988775] radeon 0000:01:00.0: fence driver on ring 0 use gpu addr 0x0000000080000c00 and cpu addr 0xffff8800ba368c00
May 28 06:13:14 [kernel] [153157.988780] radeon 0000:01:00.0: fence driver on ring 1 use gpu addr 0x0000000080000c04 and cpu addr 0xffff8800ba368c04
May 28 06:13:14 [kernel] [153157.988785] radeon 0000:01:00.0: fence driver on ring 2 use gpu addr 0x0000000080000c08 and cpu addr 0xffff8800ba368c08
May 28 06:13:14 [kernel] [153157.988792] radeon 0000:01:00.0: fence driver on ring 3 use gpu addr 0x0000000080000c0c and cpu addr 0xffff8800ba368c0c
May 28 06:13:14 [kernel] [153157.988798] radeon 0000:01:00.0: fence driver on ring 4 use gpu addr 0x0000000080000c10 and cpu addr 0xffff8800ba368c10
May 28 06:13:14 [kernel] [153157.989912] radeon 0000:01:00.0: fence driver on ring 5 use gpu addr 0x0000000000075a18 and cpu addr 0xffffc90010335a18
May 28 06:13:15 [kernel] [153158.516551] [drm:r600_ring_test] *ERROR* radeon: ring 0 test failed (scratch(0x850C)=0xCAFEDEAD)
May 28 06:13:15 [kernel] [153158.516557] [drm:si_resume] *ERROR* si startup failed on resume
May 28 06:13:15 [kernel] [153158.516629] [drm:radeon_pm_resume_dpm] *ERROR* radeon: dpm resume failed
May 28 06:13:16 [kernel] [153159.392111] radeon 0000:01:00.0: Saved 9824 dwords of commands on ring 0.
May 28 06:13:16 [kernel] [153159.392244] radeon 0000:01:00.0: GPU softreset: 0x00000048
May 28 06:13:16 [kernel] [153159.392246] radeon 0000:01:00.0: GRBM_STATUS = 0xA0003028
May 28 06:13:16 [kernel] [153159.392249] radeon 0000:01:00.0: GRBM_STATUS_SE0 = 0x00000006
May 28 06:13:16 [kernel] [153159.392251] radeon 0000:01:00.0: GRBM_STATUS_SE1 = 0x00000006
May 28 06:13:16 [kernel] [153159.392253] radeon 0000:01:00.0: SRBM_STATUS = 0x20000EC0
May 28 06:13:16 [kernel] [153159.392363] radeon 0000:01:00.0: SRBM_STATUS2 = 0x00000000
May 28 06:13:16 [kernel] [153159.392365] radeon 0000:01:00.0: R_008674_CP_STALLED_STAT1 = 0x00000000
May 28 06:13:16 [kernel] [153159.392367] radeon 0000:01:00.0: R_008678_CP_STALLED_STAT2 = 0x00010100
May 28 06:13:16 [kernel] [153159.392369] radeon 0000:01:00.0: R_00867C_CP_BUSY_STAT = 0x00420182
May 28 06:13:16 [kernel] [153159.392371] radeon 0000:01:00.0: R_008680_CP_STAT = 0x84038243
May 28 06:13:16 [kernel] [153159.392375] radeon 0000:01:00.0: R_00D034_DMA_STATUS_REG = 0x44C83D57
May 28 06:13:16 [kernel] [153159.392377] radeon 0000:01:00.0: R_00D834_DMA_STATUS_REG = 0x44C83D57
May 28 06:13:16 [kernel] [153159.392380] radeon 0000:01:00.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x00000000
May 28 06:13:16 [kernel] [153159.392383] radeon 0000:01:00.0: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x00000000
May 28 06:13:17 [kernel] [153160.661846] radeon 0000:01:00.0: Wait for MC idle timedout !
May 28 06:13:17 [kernel] [153160.661850] radeon 0000:01:00.0: GRBM_SOFT_RESET=0x0000DDFF
May 28 06:13:17 [kernel] [153160.661904] radeon 0000:01:00.0: SRBM_SOFT_RESET=0x00000100
May 28 06:13:17 [kernel] [153160.663060] radeon 0000:01:00.0: GRBM_STATUS = 0x00003028
May 28 06:13:17 [kernel] [153160.663062] radeon 0000:01:00.0: GRBM_STATUS_SE0 = 0x00000006
May 28 06:13:17 [kernel] [153160.663067] radeon 0000:01:00.0: GRBM_STATUS_SE1 = 0x00000006
May 28 06:13:17 [kernel] [153160.663075] radeon 0000:01:00.0: SRBM_STATUS = 0x20000EC0
May 28 06:13:17 [kernel] [153160.663186] radeon 0000:01:00.0: SRBM_STATUS2 = 0x00000000
May 28 06:13:17 [kernel] [153160.663193] radeon 0000:01:00.0: R_008674_CP_STALLED_STAT1 = 0x00000000
May 28 06:13:17 [kernel] [153160.663197] radeon 0000:01:00.0: R_008678_CP_STALLED_STAT2 = 0x00000000
May 28 06:13:17 [kernel] [153160.663199] radeon 0000:01:00.0: R_00867C_CP_BUSY_STAT = 0x00000000
May 28 06:13:17 [kernel] [153160.663200] radeon 0000:01:00.0: R_008680_CP_STAT = 0x00000000
May 28 06:13:17 [kernel] [153160.663207] radeon 0000:01:00.0: R_00D034_DMA_STATUS_REG = 0x44C83D57
May 28 06:13:17 [kernel] [153160.663209] radeon 0000:01:00.0: R_00D834_DMA_STATUS_REG = 0x44C83D57
May 28 06:13:17 [kernel] [153160.663478] radeon 0000:01:00.0: GPU reset succeeded, trying to resume
------

After that, the machine no longer responded, not even over ssh, so a hard reset was required. I've noticed that simply pressing the reset button is not enough; the computer has to be powered off completely.

--
You are receiving this mail because:
You are watching the assignee of the bug.