Op 03-09-13 16:20, Pasi Kärkkäinen schreef: > On Wed, Aug 28, 2013 at 09:44:15AM +0200, Maarten Lankhorst wrote: >> Op 28-08-13 08:29, Pasi Kärkkäinen schreef: >>> On Fri, Aug 23, 2013 at 11:20:42PM +0300, Pasi Kärkkäinen wrote: >>>> On Thu, Aug 22, 2013 at 09:12:40AM +0200, Maarten Lankhorst wrote: >>>>> Op 22-08-13 02:10, Ilia Mirkin schreef: >>>>>> The code expects non-VRAM mem nodes to have a pages list. If that's not >>>>>> set, it will do a null deref down the line. Warn on that condition and >>>>>> return an error. >>>>>> >>>>>> See https://bugs.freedesktop.org/show_bug.cgi?id=64774 >>>>>> >>>>>> Reported-by: Pasi Kärkkäinen <pa...@iki.fi> >>>>>> Tested-by: Pasi Kärkkäinen <pa...@iki.fi> >>>>>> Signed-off-by: Ilia Mirkin <imir...@alum.mit.edu> >>>>>> Cc: <sta...@vger.kernel.org> # 3.8+ >>>>>> --- >>>>>> >>>>>> I don't exactly understand what's going on, but this is just a >>>>>> straightforward way to avoid a null deref that you see happens in the >>>>>> bug. I haven't figured out the root cause of this, but it's getting >>>>>> well into the "I have no idea how TTM works" space. However this seems >>>>>> like a bit of defensive programming -- nouveau_vm_map_sg will pass >>>>>> node->pages as a list down, which will be dereferenced by >>>>>> nvc0_vm_map_sg. Perhaps the other arguments should make that >>>>>> dereferencing not happen, but it definitely was happening here, as you >>>>>> can see in the bug. >>>>>> >>>>>> Ben/Maarten, I'll let you judge whether this check is appropriate, >>>>>> since like I hope I was able to convey above, I'm just not really sure :) >>>>> Not it really isn't appropriate.. >>>>> >>>>> You'd have to call call nouveau_vm_map_sg_table instead, the only place >>>>> that doesn't handle that correctly >>>>> is where it's not expected to be called. >>>>> >>>>> Here, have a completely untested patch to fix things... >>>>> >>>> Maarten: I've been testing this stuff with Linux 3.10.x, so I had to >>>> modify the patch a bit to make it apply there.. >>>> I've attached the modified copy that applies to 3.10.9, hopefully I did >>>> the backport correctly. >>>> >>>> With Linux 3.10.9 and the patch applied the kernel doesn't crash anymore, >>>> and I get this error in dmesg: >>>> >>>> [ 76.105643] nouveau W[ DRM] Trying to create a fb in vram with >>>> valid_domains=00000004 >>>> >>>> Does that help? >>>> >>> Any comments? >>> >>> Maarten's patch works for me, I get that warning instead of a kernel crash, >>> but it's also a bigger change that doesn't apply to older kernels as-is. >>> >>> Ilia's original patch in this thread can be applied to older kernels as-is, >>> and it also prevents the kernel crash from happening. >>> >>> Should we get both applied, so Ilia's patch can be CC'd to stable ? >>> >> You haven't understood the root cause then. You're trying to move an >> IMPORTED dma-buf into VRAM. >> Doing so would seem to work, but at that point it's no longer a dma-buf so >> changes done by the exporter >> would not show up in nouveau because they no longer refer to the same piece >> of memory. >> > Yes, that's true, I don't understand the root cause. > That's mostly because I'm not familiar with the nouveau code/internals. > > >> Failing is the only right option here. >> > Sorry but I'm not sure I understand that correctly.. what exactly do you > suggest? What should fail? > > > Because I'm not familiar with the code (yet) understanding and finding the > root cause > will take time for me.. that's why I was suggesting to meanwhile apply Ilia's > very simple patch > from this thread which actually prevents the hard kernel crash from > happening, because it seems > like an unharmful fix to have to protect against this kind of bugs (the root > cause) ? > Or is that unacceptable? > > (To recap: The kernel crash happens very often when trying to use the nouveau > adapter in Optimus mode, with all kernels at least 3.8+, and it's very > annoying that your laptop crashes when trying to enable a nouveau output. So > Ilia's patch doesn't make the driver working properly, but at least it > prevents the hard kernel crash from happening. The crash bug is in many > kernel versions, including the long-term v3.10 tree. I've had the crash > happening with 3.8.x, 3.9.x and 3.10.x) The fix probably requires commit fdfb8332651db7a280851dfccfc4f0cff4bcd052 to apply cleanly "drm/nouveau: fix some error-path leaks in fbcon handling code"
> All comments welcome. > > Thanks, > > -- Pasi > _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel