https://bugs.freedesktop.org/show_bug.cgi?id=59649
Priority: medium
Bug ID: 59649
Assignee: dri-devel at lists.freedesktop.org
Summary: [r600][RV635] GPU lockup CP stall / GPU resets over and over - Kernel 3.7, 3.8-rcX
Severity: major
Classification: Unclassified
OS: Linux (All)
Reporter: shawn.starr at rogers.com
Hardware: x86-64 (AMD64)
Status: NEW
Version: 9.0
Component: Drivers/Gallium/r600
Product: Mesa

Using Linux kernel 3.7 and up to 3.8-rc3, I am unable to have a stable session with my RV635 GPU.

Jan 19 03:45:26 segfault kernel: [15008.313696] radeon 0000:01:00.0: Saved 185 dwords of commands on ring 0.
Jan 19 03:45:26 segfault kernel: [15008.313704] radeon 0000:01:00.0: GPU softreset
Jan 19 03:45:26 segfault kernel: [15008.313711] radeon 0000:01:00.0: R_008010_GRBM_STATUS=0xA0003030
Jan 19 03:45:26 segfault kernel: [15008.313717] radeon 0000:01:00.0: R_008014_GRBM_STATUS2=0x00000003
Jan 19 03:45:26 segfault kernel: [15008.313723] radeon 0000:01:00.0: R_000E50_SRBM_STATUS=0x200000C0
Jan 19 03:45:26 segfault kernel: [15008.313730] radeon 0000:01:00.0: R_008674_CP_STALLED_STAT1 = 0x00000000
Jan 19 03:45:26 segfault kernel: [15008.313736] radeon 0000:01:00.0: R_008678_CP_STALLED_STAT2 = 0x00000000
Jan 19 03:45:26 segfault kernel: [15008.313742] radeon 0000:01:00.0: R_00867C_CP_BUSY_STAT = 0x00000006
Jan 19 03:45:26 segfault kernel: [15008.313748] radeon 0000:01:00.0: R_008680_CP_STAT = 0x80000645
Jan 19 03:45:26 segfault kernel: [15008.313761] radeon 0000:01:00.0: R_008020_GRBM_SOFT_RESET=0x00007FEE
Jan 19 03:45:26 segfault kernel: [15008.328772] radeon 0000:01:00.0: R_008020_GRBM_SOFT_RESET=0x00000001
Jan 19 03:45:26 segfault kernel: [15008.344782] radeon 0000:01:00.0: R_008010_GRBM_STATUS=0xA0003030
Jan 19 03:45:26 segfault kernel: [15008.344785] radeon 0000:01:00.0: R_008014_GRBM_STATUS2=0x00000003
Jan 19 03:45:26 segfault kernel: [15008.344787] radeon 0000:01:00.0: R_000E50_SRBM_STATUS=0x200080C0
Jan 19 03:45:26 segfault kernel: [15008.344789] radeon 0000:01:00.0: R_008674_CP_STALLED_STAT1 = 0x00000000
Jan 19 03:45:26 segfault kernel: [15008.344792] radeon 0000:01:00.0: R_008678_CP_STALLED_STAT2 = 0x00000000
Jan 19 03:45:26 segfault kernel: [15008.344794] radeon 0000:01:00.0: R_00867C_CP_BUSY_STAT = 0x00000000
Jan 19 03:45:26 segfault kernel: [15008.344797] radeon 0000:01:00.0: R_008680_CP_STAT = 0x80100000
Jan 19 03:45:26 segfault kernel: [15008.345799] radeon 0000:01:00.0: GPU reset succeeded, trying to resume
Jan 19 03:45:26 segfault kernel: [15008.348414] [drm] probing gen 2 caps for device 8086:2a41 = 1/0
Jan 19 03:45:26 segfault kernel: [15008.350360] [drm] PCIE GART of 512M enabled (table at 0x0000000000040000).
Jan 19 03:45:26 segfault kernel: [15008.350399] radeon 0000:01:00.0: WB enabled
Jan 19 03:45:26 segfault kernel: [15008.350403] radeon 0000:01:00.0: fence driver on ring 0 use gpu addr 0x0000000020000c00 and cpu addr 0xffff880229236c00
Jan 19 03:45:26 segfault kernel: [15008.381778] [drm] ring test on 0 succeeded in 1 usecs
Jan 19 03:45:26 segfault kernel: [15008.384549] [drm] ib test on ring 0 succeeded in 0 usecs
Jan 19 03:46:12 segfault kernel: [15053.625108] radeon 0000:01:00.0: GPU lockup CP stall for more than 10000msec
...
Jan 19 03:46:12 segfault kernel: [15053.975428] radeon 0000:01:00.0: Wait for MC idle timedout !
Jan 19 03:46:12 segfault kernel: [15054.123890] radeon 0000:01:00.0: Wait for MC idle timedout !
Jan 19 03:46:12 segfault kernel: [15054.125748] [drm] PCIE GART of 512M enabled (table at 0x0000000000040000).
Jan 19 03:46:12 segfault kernel: [15054.125785] radeon 0000:01:00.0: WB enabled
Jan 19 03:46:12 segfault kernel: [15054.125789] radeon 0000:01:00.0: fence driver on ring 0 use gpu addr 0x0000000020000c00 and cpu addr 0xffff880229236c00
Jan 19 03:46:12 segfault kernel: [15054.157608] [drm] ring test on 0 succeeded in 0 usecs
Jan 19 03:46:23 segfault kernel: [15064.657103] radeon 0000:01:00.0: GPU lockup CP stall for more than 10000msec
Jan 19 03:46:23 segfault kernel: [15064.657114] radeon 0000:01:00.0: GPU lockup (waiting for 0x00000000000441b6 last fence id 0x00000000000441a8)
Jan 19 03:46:23 segfault kernel: [15064.657121] [drm:r600_ib_test] *ERROR* radeon: fence wait failed (-35).
Jan 19 03:46:23 segfault kernel: [15064.657134] [drm:radeon_ib_ring_tests] *ERROR* radeon: failed testing IB on GFX ring (-35).
Jan 19 03:46:23 segfault kernel: [15064.657140] radeon 0000:01:00.0: ib ring test failed (-35).
Jan 19 03:46:23 segfault kernel: [15064.658211] radeon 0000:01:00.0: GPU softreset
Jan 19 03:46:23 segfault kernel: [15064.658218] radeon 0000:01:00.0: R_008010_GRBM_STATUS=0xE57C24E0
Jan 19 03:46:23 segfault kernel: [15064.658224] radeon 0000:01:00.0: R_008014_GRBM_STATUS2=0x00113303
Jan 19 03:46:23 segfault kernel: [15064.658230] radeon 0000:01:00.0: R_000E50_SRBM_STATUS=0x200030C0
Jan 19 03:46:23 segfault kernel: [15064.658236] radeon 0000:01:00.0: R_008674_CP_STALLED_STAT1 = 0x01000000
Jan 19 03:46:23 segfault kernel: [15064.658242] radeon 0000:01:00.0: R_008678_CP_STALLED_STAT2 = 0x00001002
Jan 19 03:46:23 segfault kernel: [15064.658248] radeon 0000:01:00.0: R_00867C_CP_BUSY_STAT = 0x00028482
Jan 19 03:46:23 segfault kernel: [15064.658254] radeon 0000:01:00.0: R_008680_CP_STAT = 0x80838645
Jan 19 03:46:23 segfault kernel: [15064.829116] radeon 0000:01:00.0: Wait for MC idle timedout !
Jan 19 03:46:23 segfault kernel: [15064.829123] radeon 0000:01:00.0: R_008020_GRBM_SOFT_RESET=0x00007FEE
Jan 19 03:46:23 segfault kernel: [15064.844133] radeon 0000:01:00.0: R_008020_GRBM_SOFT_RESET=0x00000001
Jan 19 03:46:23 segfault kernel: [15064.860144] radeon 0000:01:00.0: R_008010_GRBM_STATUS=0xA0003030
Jan 19 03:46:23 segfault kernel: [15064.860150] radeon 0000:01:00.0: R_008014_GRBM_STATUS2=0x00000003
Jan 19 03:46:23 segfault kernel: [15064.860163] radeon 0000:01:00.0: R_000E50_SRBM_STATUS=0x2000B0C0
Jan 19 03:46:23 segfault kernel: [15064.860169] radeon 0000:01:00.0: R_008674_CP_STALLED_STAT1 = 0x00000000
Jan 19 03:46:23 segfault kernel: [15064.860175] radeon 0000:01:00.0: R_008678_CP_STALLED_STAT2 = 0x00000000
Jan 19 03:46:23 segfault kernel: [15064.860181] radeon 0000:01:00.0: R_00867C_CP_BUSY_STAT = 0x00000000
Jan 19 03:46:23 segfault kernel: [15064.860191] radeon 0000:01:00.0: R_008680_CP_STAT = 0x80100000
Jan 19 03:46:23 segfault kernel: [15064.861197] radeon 0000:01:00.0: GPU reset succeeded, trying to resume
Jan 19 04:39:23 segfault kernel: [ 2791.671107] [drm:r600_ib_test] *ERROR* radeon: fence wait failed (-35).
Jan 19 04:39:23 segfault kernel: [ 2791.671115] [drm:radeon_ib_ring_tests] *ERROR* radeon: failed testing IB on GFX ring (-35).

The console is then flooded (over and over) with:

[drm:radeon_cs_ib_chunk] *ERROR* Failed to schedule IB !
radeon 0000:01:00.0: couldn't schedule ib

mesa-dri-drivers-9.0.1-3.fc18.x86_64
libdrm-2.4.40-1.fc18.x86_64
kernels: kernel-3.7.3-201.fc18.x86_64, kernel-devel-3.8.0-0.rc3.git1.2.fc19.x86_64

I have not tried 3.8-rc4 yet.

Laptop: Lenovo ThinkPad W500
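
For triage, it can help to see which status registers differ between the dump printed before a soft reset and the one printed after it. The following is a minimal sketch, assuming the log line format shown in this report; the names `parse_register_dump`, `changed_registers` and `SAMPLE` are hypothetical helpers, not part of the radeon driver or any existing tool, and the regular expression is fitted only to the lines quoted here. The sample is a subset of the first reset in the log. `GRBM_SOFT_RESET` lines are skipped on the assumption that they record the reset request rather than a status readback.

```python
import re

# Matches register lines such as
#   [15008.313711] radeon 0000:01:00.0: R_008010_GRBM_STATUS=0xA0003030
# including the variant with spaces around "=".
REG_LINE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\].*?"
    r"R_(?P<offset>[0-9A-Fa-f]{6})_(?P<name>\w+)\s*=\s*0x(?P<value>[0-9A-Fa-f]{8})"
)


def parse_register_dump(log_text):
    """Return a list of (timestamp, register_name, value) tuples in log order."""
    entries = []
    for line in log_text.splitlines():
        m = REG_LINE.search(line)
        if m is None:
            continue
        # Assumption: SOFT_RESET lines are reset requests, not status readbacks.
        if m["name"].endswith("SOFT_RESET"):
            continue
        entries.append((float(m["ts"]), m["name"], int(m["value"], 16)))
    return entries


def changed_registers(entries):
    """Return {name: (first_value, last_value)} for registers whose value changed."""
    first, last = {}, {}
    for _, name, value in entries:
        first.setdefault(name, value)  # remember the earliest value seen
        last[name] = value             # keep overwriting with the latest value
    return {n: (first[n], last[n]) for n in first if first[n] != last[n]}


# Subset of the first soft reset in the report (before and after the reset).
SAMPLE = """\
Jan 19 03:45:26 segfault kernel: [15008.313711] radeon 0000:01:00.0: R_008010_GRBM_STATUS=0xA0003030
Jan 19 03:45:26 segfault kernel: [15008.313723] radeon 0000:01:00.0: R_000E50_SRBM_STATUS=0x200000C0
Jan 19 03:45:26 segfault kernel: [15008.313742] radeon 0000:01:00.0: R_00867C_CP_BUSY_STAT = 0x00000006
Jan 19 03:45:26 segfault kernel: [15008.313748] radeon 0000:01:00.0: R_008680_CP_STAT = 0x80000645
Jan 19 03:45:26 segfault kernel: [15008.313761] radeon 0000:01:00.0: R_008020_GRBM_SOFT_RESET=0x00007FEE
Jan 19 03:45:26 segfault kernel: [15008.344782] radeon 0000:01:00.0: R_008010_GRBM_STATUS=0xA0003030
Jan 19 03:45:26 segfault kernel: [15008.344787] radeon 0000:01:00.0: R_000E50_SRBM_STATUS=0x200080C0
Jan 19 03:45:26 segfault kernel: [15008.344794] radeon 0000:01:00.0: R_00867C_CP_BUSY_STAT = 0x00000000
Jan 19 03:45:26 segfault kernel: [15008.344797] radeon 0000:01:00.0: R_008680_CP_STAT = 0x80100000
"""

if __name__ == "__main__":
    for name, (before, after) in changed_registers(parse_register_dump(SAMPLE)).items():
        print(f"{name}: 0x{before:08X} -> 0x{after:08X}")
```

The script only compares the first and last value seen for each register, so it should be fed one reset cycle at a time; feeding it the whole log would compare the first dump against the final one.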