Strange. I tested X overnight with ~120 glxgears instances that were killed and restarted every 60-120 seconds, and saw no lockup or freeze.
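
A stress run of that kind (roughly 120 glxgears clients, each killed and restarted after 60-120 seconds) could be scripted along the following lines. This is only a sketch and not the script that was actually used: the variable names, the helper functions and the RUN_STRESS guard are all assumptions made for illustration.

```shell
#!/bin/bash
# Sketch of an overnight client-churn loop. All names and defaults below are
# assumptions for illustration; only the counts and timings come from the text.

CLIENT=${CLIENT:-glxgears}   # program to spawn
COUNT=${COUNT:-120}          # number of parallel instances
MIN_S=${MIN_S:-60}           # shortest lifetime in seconds
MAX_S=${MAX_S:-120}          # longest lifetime in seconds

# Print a random lifetime in [min, max] seconds.
pick_lifetime() {
    local min=$1 max=$2
    echo $(( min + RANDOM % (max - min + 1) ))
}

# Run one client forever: start it, let it live for a while, kill it, repeat.
churn_one() {
    while :; do
        "$CLIENT" >/dev/null 2>&1 &
        local pid=$!
        sleep "$(pick_lifetime "$MIN_S" "$MAX_S")"
        kill "$pid" 2>/dev/null
        wait "$pid" 2>/dev/null
    done
}

# Only start the loop when explicitly asked, so the helpers can be sourced alone.
if [ "${RUN_STRESS:-0}" = 1 ]; then
    for i in $(seq "$COUNT"); do
        churn_one &
    done
    wait
fi
```

Started with RUN_STRESS=1 from inside the X session (so that DISPLAY is set), it keeps COUNT clients churning until interrupted.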

What's the kernel backtrace when this happens? If I understand you correctly, X is still killable in that situation, is that right?

Please try the following before starting X:

    echo 1 > /sys/kernel/debug/tracing/events/radeon/radeon_fence_wait_begin/enable
    echo 1 > /sys/kernel/debug/tracing/events/radeon/radeon_fence_wait_end/enable

And when X freezes, run "cat /sys/kernel/debug/tracing/trace".

Thanks,
Christian.

On 15.10.2013 12:57, Marek Olšák wrote:
> They are not lockups. X just freezes in GEM_WAIT. The only way to
> reproduce it is to apply the patches, use the computer and wait. It
> looks like a fence is not signalled and the process calling GEM_WAIT
> is not woken up.
>
> Marek
>
> On Tue, Oct 15, 2013 at 11:11 AM, Christian König
> <deathsimple at vodafone.de> wrote:
>> Mhm hard to say what's going wrong this time, but we probably need to fix it
>> before the final release.
>>
>> Do you have a kernel backtrace from the lockups? Or at least some way to
>> reproduce it?
>>
>> Christian.
>>
>> On 14.10.2013 21:34, Marek Olšák wrote:
>>
>>> Ooops, the new problem is not so rare. It has now happened to me 3
>>> times in an hour.
>>>
>>> Marek
>>>
>>> On Mon, Oct 14, 2013 at 9:13 PM, Marek Olšák <maraeo at gmail.com> wrote:
>>>> I tested this and had over 1546 lockups followed by a successful GPU
>>>> reset. Then the kernel probably crashed (judging by the fact ssh was
>>>> dead). Still, it's pretty impressive.
>>>>
>>>> There is a new problem though. The X server sometimes gets stuck in
>>>> GEM_WAIT and waits forever, even if there were no lockups before. It
>>>> occurs very rarely though. I didn't see this issue without your
>>>> patches.
>>>>
>>>> Marek
>>>>
>>>> On Mon, Oct 14, 2013 at 11:32 AM, Christian König
>>>> <deathsimple at vodafone.de> wrote:
>>>>> From: Christian König <christian.koenig at amd.com>
>>>>>
>>>>> Stop leaking IB memory and scratch register space when the test fails.
>>>>>
>>>>> Signed-off-by: Christian König <christian.koenig at amd.com>
>>>>> ---
>>>>>  drivers/gpu/drm/radeon/cik.c | 3 +++
>>>>>  1 file changed, 3 insertions(+)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/radeon/cik.c b/drivers/gpu/drm/radeon/cik.c
>>>>> index b874ccd..8f393df 100644
>>>>> --- a/drivers/gpu/drm/radeon/cik.c
>>>>> +++ b/drivers/gpu/drm/radeon/cik.c
>>>>> @@ -3182,6 +3182,7 @@ int cik_ib_test(struct radeon_device *rdev, struct radeon_ring *ring)
>>>>>          r = radeon_ib_get(rdev, ring->idx, &ib, NULL, 256);
>>>>>          if (r) {
>>>>>                  DRM_ERROR("radeon: failed to get ib (%d).\n", r);
>>>>> +                radeon_scratch_free(rdev, scratch);
>>>>>                  return r;
>>>>>          }
>>>>>          ib.ptr[0] = PACKET3(PACKET3_SET_UCONFIG_REG, 1);
>>>>> @@ -3198,6 +3199,8 @@ int cik_ib_test(struct radeon_device *rdev, struct radeon_ring *ring)
>>>>>          r = radeon_fence_wait(ib.fence, false);
>>>>>          if (r) {
>>>>>                  DRM_ERROR("radeon: fence wait failed (%d).\n", r);
>>>>> +                radeon_scratch_free(rdev, scratch);
>>>>> +                radeon_ib_free(rdev, &ib);
>>>>>                  return r;
>>>>>          }
>>>>>          for (i = 0; i < rdev->usec_timeout; i++) {
>>>>> --
>>>>> 1.8.1.2
>>>>>
>>>>> _______________________________________________
>>>>> dri-devel mailing list
>>>>> dri-devel at lists.freedesktop.org
>>>>> http://lists.freedesktop.org/mailman/listinfo/dri-devel