Dale Schroeder <d...@briannassaladdressing.com> writes:

> I've been able to narrow this down a bit more.  One of my other
> systems still had 3.2.0-3-686-pae (3.2.23-1) in its apt archives.  This
> kernel also boots successfully and does not hang at mdm startup.  So,
> the problem did not exist in any of the 3.2.0-3 series released to
> Wheezy, but was introduced sometime between this image and 3.2.32-1,
> the 1st of the 3.2.0-4 series released to Wheezy.

I note that the changelog for

  linux (3.2.29-1) unstable; urgency=low

includes

  * [x86] drm/i915: Fix i8xx interrupt handling (Closes: #655152)

which is extremely suspicious in this context.  I wonder if anyone
experiencing this bug has tried reverting this patch?:
http://bugs.debian.org/cgi-bin/bugreport.cgi?msg=59;filename=drm-i915-i8xx-interrupt-handler.patch;att=1;bug=655152

Note that it is another shot in the dark - I have absolutely no idea
what's going on here.  But I have a feeling that patch is replacing an
annoying bug on one platform with a critical bug on another.  Not sure
that is a good tradeoff...

The debug output kindly provided by Сергей in
http://bugs.debian.org/cgi-bin/bugreport.cgi?msg=36;filename=dmesg.txt;att=4;bug=692607
shows (don't know why I included &work->entry - that's pointless):

  [   59.548011] work->data=0x00000000, &work->entry=de33c56c, work->entry.next=  (null), work->entry.prev=  (null)

which clearly tells us that the problem is related to
i915_handle_error() calling

  queue_work(dev_priv->wq, &dev_priv->error_work);

with an uninitialized error_work.  As noted earlier, this is supposed to
be initialized in intel_irq_init(), so either that has not happened
(yet?) or something has zeroed it out later.  I am putting a beer on the
first alternative.

Right.  It's even bloody obvious (now that you all have pointed to the
releases surrounding the 655152 bugfix).

That patch adds this i8xx specific function:

+static void i8xx_irq_preinstall(struct drm_device * dev)
+{
+	drm_i915_private_t *dev_priv = (drm_i915_private_t *) dev->dev_private;
+	int pipe;
+
+	atomic_set(&dev_priv->irq_received, 0);
+
+	for_each_pipe(pipe)
+		I915_WRITE(PIPESTAT(pipe), 0);
+	I915_WRITE16(IMR, 0xffff);
+	I915_WRITE16(IER, 0x0);
+	POSTING_READ16(IER);
+}

replacing this for all chips matching "(INTEL_INFO(dev)->gen == 2)":

static void i915_driver_irq_preinstall(struct drm_device * dev)
{
	drm_i915_private_t *dev_priv = (drm_i915_private_t *) dev->dev_private;
	int pipe;

	atomic_set(&dev_priv->irq_received, 0);

	INIT_WORK(&dev_priv->hotplug_work, i915_hotplug_work_func);
	INIT_WORK(&dev_priv->error_work, i915_error_work_func);

	if (I915_HAS_HOTPLUG(dev)) {
		I915_WRITE(PORT_HOTPLUG_EN, 0);
		I915_WRITE(PORT_HOTPLUG_STAT, I915_READ(PORT_HOTPLUG_STAT));
	}

	I915_WRITE(HWSTAM, 0xeffe);
	for_each_pipe(pipe)
		I915_WRITE(PIPESTAT(pipe), 0);
	I915_WRITE(IMR, 0xffffffff);
	I915_WRITE(IER, 0x0);
	POSTING_READ(IER);
}

Anyone able to spot the missing INIT_WORK()'s?

Based on the I915_HAS_HOTPLUG(dev) test, I assume that leaving the first
one out was intentional.  But the second one cannot be left out, as
demonstrated by these bug reports.

I am attaching a proposed fix on top of the 655152 patch, which I have
not tested at all on actual Debian kernel sources.  Might need context
adjustments.  I'd appreciate it if anyone with crashing hardware could
test it.


Bjørn

From d2451aff41d2db6047586c22317cd247e4c000ca Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Bj=C3=B8rn=20Mork?= <bj...@mork.no>
Date: Thu, 28 Feb 2013 11:26:20 +0100
Subject: [PATCH] drm/i915: initialize error_work for i8xx interrupt handler
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The backport of upstream commit c2798b19bac2538393fc932bfbe59807a4734b3e
failed to initialize the error_work struct for gen2 hardware, resulting
in hitting a BUG in kernel/workqueue.c if/when the interrupt handler
tried to queue error handling work.

Signed-off-by: Bjørn Mork <bj...@mork.no>
---
 drivers/gpu/drm/i915/i915_irq.c |    2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 47b08ce..bb9b943 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -2102,6 +2102,8 @@ static void i8xx_irq_preinstall(struct drm_device * dev)
 
 	atomic_set(&dev_priv->irq_received, 0);
 
+	INIT_WORK(&dev_priv->error_work, i915_error_work_func);
+
 	for_each_pipe(pipe)
 		I915_WRITE(PIPESTAT(pipe), 0);
 	I915_WRITE16(IMR, 0xffff);
-- 
1.7.10.4
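For anyone who wants to see why the all-zero work struct in the dmesg output is a problem, here is a rough user-space model of it. This is only a sketch, not kernel code: every name prefixed with mock_ is made up for illustration, and the check in mock_queue_work() is an assumption about the kind of sanity test the workqueue code applies to work->entry, not a copy of it. The point is just that a list head which has never been initialized has next == NULL, which is not the same as an empty list (next pointing back at itself).

```c
/* User-space model only. All mock_* names are hypothetical, not kernel APIs. */
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct list_head {
	struct list_head *next, *prev;
};

/* Same convention as the kernel list: an empty list points at itself. */
static void mock_init_list_head(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static int mock_list_empty(const struct list_head *h)
{
	return h->next == h;
}

/* Simplified stand-in for a work item: data word, list entry, callback. */
struct mock_work {
	unsigned long data;
	struct list_head entry;
	void (*func)(struct mock_work *);
};

/* Roughly what an INIT_WORK()-style initializer has to set up. */
static void mock_init_work(struct mock_work *w,
			   void (*func)(struct mock_work *))
{
	w->data = 0;
	mock_init_list_head(&w->entry);
	w->func = func;
}

/*
 * Assumed sanity check: a work item must not already sit on a list.
 * Returns 0 if it may be queued, -1 where the real code would complain.
 */
static int mock_queue_work(struct mock_work *w)
{
	if (!mock_list_empty(&w->entry))
		return -1;
	return 0;
}

static void mock_error_work_func(struct mock_work *w)
{
	(void)w;
}

/* Never initialized: entry.next == NULL, as in the dmesg line above. */
static int queue_zeroed(void)
{
	struct mock_work w;

	memset(&w, 0, sizeof(w));
	return mock_queue_work(&w);
}

/* Initialized first, which is what the proposed patch adds for gen2. */
static int queue_initialized(void)
{
	struct mock_work w;

	memset(&w, 0, sizeof(w));
	mock_init_work(&w, mock_error_work_func);
	return mock_queue_work(&w);
}
```

In this model queue_zeroed() is rejected and queue_initialized() is accepted, which matches the reasoning above: the zeroed error_work looks like it is already linked into some list, so the initializer has to run before the interrupt handler can queue it.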