Re: i915 modeset memory corruption issues? (Fwd: Oops in ext3_block_to_path.isra.40+0x26/0x11b)
On Fri, Mar 16, 2012 at 11:01 PM, Keith Packard wrote: > <#part sign=pgpmime> > On Fri, 16 Mar 2012 16:47:46 +, Dave Airlie wrote: > >> The hibernate issue is known and I've been hoping someone from Intel >> would run with debugging it, they have a big enough team that I don't >> feel I can expend the personal time to look into it. > > Yeah, I've been chatting with a couple of intel folks; we should have a > test patch ready shortly, but we haven't been able to reproduce anything > like this... > >> Maybe Keith can push someone or maybe I just refuse pull requests >> until one with a fix appears. > > From what I've seen, this is a problem only on Ironlake machines, which > makes me afraid of some weird GTT flushing issue, given the adventures > we had with VT-d on that hardware where we idle the gpu before any GTT > updates. I wonder what would happen if we idled the GPU before any GTT > updates even when VT-d wasn't running... We've had reports on 965GM in Fedora, maybe davej has the specific bug. Dave. ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
[Bug 45366] Radeon gpu lockups
https://bugs.freedesktop.org/show_bug.cgi?id=45366 --- Comment #5 from Ernst Sjöstrand 2012-03-17 03:39:47 PDT --- This might be related to reclocking the GPU because when I tried dynpm it happened more often and it often happens when waking the computer up from DPMS. -- Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email --- You are receiving this mail because: --- You are the assignee for the bug. ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
[Bug 29412] fans running at full-speed after resume from suspend with radeon and KMS
https://bugzilla.kernel.org/show_bug.cgi?id=29412 --- Comment #4 from Jon Dowland 2012-03-17 11:27:26 --- Ok retesting v2.6.32-rc1 first, I had trouble building it (gcc 4.6 won't work); trouble getting the initramfs to work, and trouble getting X to start (some kind of race condition / bad failure mode in the gdm3 init script launched ~300 Xorg instances before I killed it). However, I could reproduce the issue without X: single-user mode, no radeon module loaded: fans climbed down as they should. Single-user mode, radeon driver loaded: fans didn't climb down. I'll check v2.6.32-rc1^ next, then I'll collect dmesg/Xorg.0.logs as appropriate. (I confirmed this with 3.3~rc6-1~experimental.1 in the mean time) -- Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email --- You are receiving this mail because: --- You are watching the assignee of the bug. ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
[Bug 29412] fans running at full-speed after resume from suspend with radeon and KMS
https://bugzilla.kernel.org/show_bug.cgi?id=29412 --- Comment #5 from Jon Dowland 2012-03-17 11:49:13 --- Created an attachment (id=72633) --> (https://bugzilla.kernel.org/attachment.cgi?id=72633) dmesg from v2.6.32-rc1^ Frustratingly, the issue reared its head with v2.6.32-rc1^. This dmesg is from a single user mode session, two suspends with "modprobe radeon" in the middle. It's getting hard to run kernels this old (even though they aren't that old!) notice the error loading the radeon firmware: no idea why it hasn't managed, other kernels do (and it's there). Also udev refuses to start with < 2.6.32. I may have to set up a test root with an older userspace (squeeze perhaps) to go further :( Does this basically invalidate my previous bisect? -- Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email --- You are receiving this mail because: --- You are watching the assignee of the bug. ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCH] Only build test programs with make check
Signed-off-by: Matt Turner --- tests/kmstest/Makefile.am |2 +- tests/modeprint/Makefile.am |2 +- tests/modetest/Makefile.am |2 +- tests/radeon/Makefile.am|2 +- tests/vbltest/Makefile.am |2 +- 5 files changed, 5 insertions(+), 5 deletions(-) diff --git a/tests/kmstest/Makefile.am b/tests/kmstest/Makefile.am index ae562a1..10b9ef3 100644 --- a/tests/kmstest/Makefile.am +++ b/tests/kmstest/Makefile.am @@ -3,7 +3,7 @@ AM_CFLAGS = \ -I$(top_srcdir)/libkms/ \ -I$(top_srcdir) -noinst_PROGRAMS = \ +check_PROGRAMS = \ kmstest kmstest_SOURCES = \ diff --git a/tests/modeprint/Makefile.am b/tests/modeprint/Makefile.am index c4862ac..4291dc5 100644 --- a/tests/modeprint/Makefile.am +++ b/tests/modeprint/Makefile.am @@ -2,7 +2,7 @@ AM_CFLAGS = \ -I$(top_srcdir)/include/drm \ -I$(top_srcdir) -noinst_PROGRAMS = \ +check_PROGRAMS = \ modeprint modeprint_SOURCES = \ diff --git a/tests/modetest/Makefile.am b/tests/modetest/Makefile.am index 2191242..dd43d0c 100644 --- a/tests/modetest/Makefile.am +++ b/tests/modetest/Makefile.am @@ -4,7 +4,7 @@ AM_CFLAGS = \ -I$(top_srcdir) \ $(CAIRO_CFLAGS) -noinst_PROGRAMS = \ +check_PROGRAMS = \ modetest modetest_SOURCES = \ diff --git a/tests/radeon/Makefile.am b/tests/radeon/Makefile.am index 1775669..d4f6755 100644 --- a/tests/radeon/Makefile.am +++ b/tests/radeon/Makefile.am @@ -4,7 +4,7 @@ AM_CFLAGS = \ LDADD = $(top_builddir)/libdrm.la -noinst_PROGRAMS = \ +check_PROGRAMS = \ radeon_ttm radeon_ttm_SOURCES = \ diff --git a/tests/vbltest/Makefile.am b/tests/vbltest/Makefile.am index 77f9037..5886bd1 100644 --- a/tests/vbltest/Makefile.am +++ b/tests/vbltest/Makefile.am @@ -2,7 +2,7 @@ AM_CFLAGS = \ -I$(top_srcdir)/include/drm \ -I$(top_srcdir) -noinst_PROGRAMS = \ +check_PROGRAMS = \ vbltest vbltest_SOURCES = \ -- 1.7.3.4 ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH] Only build test programs with make check
With my limited knowledge of automake I'm going to have to NACK this patch. Most of these programs are used during driver bring up to test things out, often we also modify them a bit to suit our need. If this patch doesn't make this any harder then disregard my NACK. Cheers Jakob. - Original Message - > Signed-off-by: Matt Turner > --- > tests/kmstest/Makefile.am |2 +- > tests/modeprint/Makefile.am |2 +- > tests/modetest/Makefile.am |2 +- > tests/radeon/Makefile.am|2 +- > tests/vbltest/Makefile.am |2 +- > 5 files changed, 5 insertions(+), 5 deletions(-) > > diff --git a/tests/kmstest/Makefile.am b/tests/kmstest/Makefile.am > index ae562a1..10b9ef3 100644 > --- a/tests/kmstest/Makefile.am > +++ b/tests/kmstest/Makefile.am > @@ -3,7 +3,7 @@ AM_CFLAGS = \ > -I$(top_srcdir)/libkms/ \ > -I$(top_srcdir) > > -noinst_PROGRAMS = \ > +check_PROGRAMS = \ > kmstest > > kmstest_SOURCES = \ > diff --git a/tests/modeprint/Makefile.am > b/tests/modeprint/Makefile.am > index c4862ac..4291dc5 100644 > --- a/tests/modeprint/Makefile.am > +++ b/tests/modeprint/Makefile.am > @@ -2,7 +2,7 @@ AM_CFLAGS = \ > -I$(top_srcdir)/include/drm \ > -I$(top_srcdir) > > -noinst_PROGRAMS = \ > +check_PROGRAMS = \ > modeprint > > modeprint_SOURCES = \ > diff --git a/tests/modetest/Makefile.am b/tests/modetest/Makefile.am > index 2191242..dd43d0c 100644 > --- a/tests/modetest/Makefile.am > +++ b/tests/modetest/Makefile.am > @@ -4,7 +4,7 @@ AM_CFLAGS = \ > -I$(top_srcdir) \ > $(CAIRO_CFLAGS) > > -noinst_PROGRAMS = \ > +check_PROGRAMS = \ > modetest > > modetest_SOURCES = \ > diff --git a/tests/radeon/Makefile.am b/tests/radeon/Makefile.am > index 1775669..d4f6755 100644 > --- a/tests/radeon/Makefile.am > +++ b/tests/radeon/Makefile.am > @@ -4,7 +4,7 @@ AM_CFLAGS = \ > > LDADD = $(top_builddir)/libdrm.la > > -noinst_PROGRAMS = \ > +check_PROGRAMS = \ > radeon_ttm > > radeon_ttm_SOURCES = \ > diff --git a/tests/vbltest/Makefile.am b/tests/vbltest/Makefile.am > index 77f9037..5886bd1 100644 > --- a/tests/vbltest/Makefile.am > +++ b/tests/vbltest/Makefile.am > @@ -2,7 +2,7 @@ AM_CFLAGS = \ > -I$(top_srcdir)/include/drm \ > -I$(top_srcdir) > > -noinst_PROGRAMS = \ > +check_PROGRAMS = \ > vbltest > > vbltest_SOURCES = \ > -- > 1.7.3.4 > > ___ > dri-devel mailing list > dri-devel@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/dri-devel > ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH] Only build test programs with make check
On Sat, Mar 17, 2012 at 2:35 PM, Jakob Bornecrantz wrote: > With my limited knowledge of automake I'm going to have > to NACK this patch. Most of these programs are used during > driver bring up to test things out, often we also modify > them a bit to suit our need. > > If this patch doesn't make this any harder then > disregard my NACK. Harder, like having to run make check? I'm not sure about modeprint, so I can drop that, but the rest look like programs you'd want to run during make check. ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: i915 modeset memory corruption issues? (Fwd: Oops in ext3_block_to_path.isra.40+0x26/0x11b)
<#part sign=pgpmime> On Sat, 17 Mar 2012 07:41:14 +, Dave Airlie wrote: > We've had reports on 965GM in Fedora, maybe davej has the specific > bug. Are these reports relatively recent, leading to a possible software bug introduced in the last couple of releases? -- keith.pack...@intel.com ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: i915 modeset memory corruption issues? (Fwd: Oops in ext3_block_to_path.isra.40+0x26/0x11b)
On Sat, Mar 17, 2012 at 6:59 PM, Keith Packard wrote: > <#part sign=pgpmime> > On Sat, 17 Mar 2012 07:41:14 +, Dave Airlie wrote: > >> We've had reports on 965GM in Fedora, maybe davej has the specific >> bug. > > Are these reports relatively recent, leading to a possible software bug > introduced in the last couple of releases? No I think they are since 0-day KMS reports. but its St Patricks day so really not a reliable info source for a couple of days :-) Dave. ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH] RFC: dma-buf: userspace mmap support
> > dma-buf file descriptor. Userspace access to the buffer should be > > bracketed with DMA_BUF_IOCTL_{PREPARE,FINISH}_ACCESS ioctl calls to > > give the exporting driver a chance to deal with cache synchronization > > and such for cached userspace mappings without resorting to page There should be flags indicating if this is necessary. We don't want extra syscalls on hardware that doesn't need it. The other question is what info is needed as you may only want to poke a few pages out of cache and the prepare/finish on its own gives no info. > E.g. If another device was writing to the buffer, the prepare ioctl > could block until that device had finished accessing that buffer. How do you avoid deadlocks on this ? We need very clear ways to ensure things always complete in some form given multiple buffer owner/requestors and the fact this API has no "prepare-multiple-buffers" support. Alan ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: i915 modeset memory corruption issues? (Fwd: Oops in ext3_block_to_path.isra.40+0x26/0x11b)
On Sat, Mar 17, 2012 at 7:20 PM, Dave Airlie wrote: > On Sat, Mar 17, 2012 at 6:59 PM, Keith Packard wrote: >> <#part sign=pgpmime> >> On Sat, 17 Mar 2012 07:41:14 +, Dave Airlie wrote: >> >>> We've had reports on 965GM in Fedora, maybe davej has the specific >>> bug. >> >> Are these reports relatively recent, leading to a possible software bug >> introduced in the last couple of releases? > > No I think they are since 0-day KMS reports. > > but its St Patricks day so really not a reliable info source for a > couple of days :-) I did however get a flashback in google to this: http://lists.fedoraproject.org/pipermail/scm-commits/2010-July/456636.html Linus don't think we ever did work out why that worked, I wonder if we lost something after that. Dave. ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: i915 modeset memory corruption issues? (Fwd: Oops in ext3_block_to_path.isra.40+0x26/0x11b)
Okay found the 965GM report https://bugzilla.kernel.org/show_bug.cgi?id=37142 Dave. ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: i915 modeset memory corruption issues? (Fwd: Oops in ext3_block_to_path.isra.40+0x26/0x11b)
On Sat, Mar 17, 2012 at 1:23 PM, Dave Airlie wrote: > > I did however get a flashback in google to this: > > http://lists.fedoraproject.org/pipermail/scm-commits/2010-July/456636.html > > Linus don't think we ever did work out why that worked, I wonder if we > lost something after that. Hmm. Maybe we should stop making the default gfp_mask be GFP_HIGHMEM_MOVABLE (see inode_init_always() in fs/inode.c), and instead add the _MOVABLE flag only for mappings that use "generic_file_mmap()". It does sound a bit scary to just make default mappings use MOVABLE pages, when we know that can be incorrect for some cases. But that would require that filesystems like ext4 etc that don't just use the generic_file_mmap() function would have add do that MOVABLE thing on their own. And the *normal* case is certainly that pages are movable and putting them in themovable zone should be ok. It's just *not* ok if you also map them into some hardware GTT thing or just otherwise keep track of the pages directly, like DRI does. The other possibility is to just make this a shm thing, since it's generally that layer that has the case of "pages allocated with the mapping gfp masks". Hugh Dickins clearly tried to make sure that the DRM initialization of the gfp mask was honored, but maybe there is some case where it is missed. Hugh added a mapping_set_gfp_mask() to i915_gem_alloc_object(), but maybe we have i915 shmem mappings that get set up through some other path. What about the ttm swap storage thing, for example? That also does shmem_file_setup(), without limiting the pages. But I don't think i915 uses ttm, right? Al? Hugh? Opinions? Right now anybody who uses the shmem_read_mapping_page() function is in danger of getting a MOVABLE page by default. Which may or may not be right. That said, I don't see how i915 would get a movable page. It *seems* to do the right setup for the gfp flags of the mapping. Linus ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH] RFC: dma-buf: userspace mmap support
On Sat, Mar 17, 2012 at 3:17 PM, Alan Cox wrote: >> > dma-buf file descriptor. Userspace access to the buffer should be >> > bracketed with DMA_BUF_IOCTL_{PREPARE,FINISH}_ACCESS ioctl calls to >> > give the exporting driver a chance to deal with cache synchronization >> > and such for cached userspace mappings without resorting to page > > There should be flags indicating if this is necessary. We don't want extra > syscalls on hardware that doesn't need it. The other question is what > info is needed as you may only want to poke a few pages out of cache and > the prepare/finish on its own gives no info. Well, there isn't really a convenient way to know, for some random code that is just handed a dmabuf fd, what the flags are without passing around an extra param in userspace. So I'd tend to say, just live with the syscall even if it is a no-op (because if you are doing sw access to the buffer, that is anyways some slow/legacy path). But I'm open to suggestions. As far as just peeking/poking a few pages, that is where some later ioctls or additional params could come in, to give some hints. But I wanted to keep it simple to start. >> E.g. If another device was writing to the buffer, the prepare ioctl >> could block until that device had finished accessing that buffer. > > How do you avoid deadlocks on this ? We need very clear ways to ensure > things always complete in some form given multiple buffer > owner/requestors and the fact this API has no "prepare-multiple-buffers" > support. Probably some separate synchronization is needed.. I'm not really sure if prepare/finish (or map/unmap on kernel side) is a the right way to handle that. BR, -R > Alan > ___ > dri-devel mailing list > dri-devel@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/dri-devel ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: i915 modeset memory corruption issues? (Fwd: Oops in ext3_block_to_path.isra.40+0x26/0x11b)
On Sat, Mar 17, 2012 at 3:09 PM, Hugh Dickins wrote: > > I've got to go out for an hour: I'll digest more and think more about > this when I get back. If someone could explain the original problem > with _MOVABLE, that would help me: I do not believe we actually ever uncovered the original problem with _MOVABLE: the problem was bisected down to the stable-backported version of commit 4bdadb978569 ("drm/i915: Selectively enable self-reclaim"), and I looked at the changes and decided that one of the main ones was the removal of the mapping_set_gfp_mask() - which resulted in __GFP_MOVABLE being on for the mapping. So we just tried re-introducing that, and it fixed the problem. Exactly *why* movable pages are a problem was never all that clear. The assumption was (and I guess still is) that there are physical pointers to the pages in the DRM pages themselves. Quoting from the original thread: "I could easily see that something would get very unhappy and corrupt memory if the suspend-to-disk phase ends up compacting memory and moving the pages around from under the i915 driver." but I didn't actually see why the i915 page pinning would be defeated by __GFP_MOVABLE. The code does get a reference to them afaik. So no, we don't have a real first cause. We just have a band-aid for the symptom. It might be a pure bug in hibernation that just assumes that any pages in MOVABLE zones can always be moved without even checking any refcounts, so not using __GFP_MOVABLE might just be working around problems elsewhere. Or it's entirely possible that the DRM layer does *not* increment the refcounts of the pages properly, and that hibernate with movable pages is just a very efficient way of showing the problems that results in. > A factor which might turn out to be relevant: swapin_readahead() (or > swapoff) brings in pages from swap before it knows who they belong to, > so the swapped-in swapcache pages don't necessarily have the limitations > mapping_set_gfp_mask() has asked for. It's been discussed with Alan, > we know it will need fixing for gma500 when eventually that comes to > be used on machines with more than 4GB, but for now I wasn't aware of > a problem. So i915 shouldn't much care about highmem vs non-highmem, it's just that it does hold on to pages and take their physical address. But as far as I can tell, whenever the physical address is in use, it does have a refcount to the page. So for example, i915_gem_object_get_pages_gtt() will use shmem_read_mapping_page_gfp() which will increment the page count for the page it gets, so all the obj->pages[] entries should have properly incremented page counts. And they get released by i915_gem_object_put_pages_gtt(), but maybe that is called too early while the pages are still in use by the GFX unit... If there is a refcounting issue, hibernate + __GFP_MOVABLE might just be a great way to see it. I don't really know the code, though. Linus ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: i915 modeset memory corruption issues? (Fwd: Oops in ext3_block_to_path.isra.40+0x26/0x11b)
<#part sign=pgpmime> On Sat, 17 Mar 2012 15:52:15 -0700, Linus Torvalds wrote: > I do not believe we actually ever uncovered the original problem with > _MOVABLE: the problem was bisected down to the stable-backported > version of commit 4bdadb978569 ("drm/i915: Selectively enable > self-reclaim"), and I looked at the changes and decided that one of > the main ones was the removal of the mapping_set_gfp_mask() - which > resulted in __GFP_MOVABLE being on for the mapping. Can anyone explain what __GFP_MOVABLE even does? I can't understand what this flag would be for; if the page is locked (with get_page), then the page cannot move. If it isn't locked, then it's subject to swapping, in which case the page will almost certainly move when it returns from disk. Is it that the page won't move if it isn't swapped? That doesn't seem all that useful to me. > but I didn't actually see why the i915 page pinning would be defeated > by __GFP_MOVABLE. The code does get a reference to them afaik. GTT mapping and page locking are done in lock-step in the driver: i915_gem_object_bind_to_gtt i915_gem_object_get_pages_gtt pins the pages i915_gem_gtt_bind_object maps to GTT i915_gem_object_unbind i915_gem_gtt_unbind_object unmaps from GTT i915_gem_object_put_pages_gtt unpins the pages. There are no other code paths to unmapping objects from the GTT or unpinning the pages that I can find. > So for example, i915_gem_object_get_pages_gtt() will use > shmem_read_mapping_page_gfp() which will increment the page count for > the page it gets, so all the obj->pages[] entries should have properly > incremented page counts. And they get released by > i915_gem_object_put_pages_gtt(), but maybe that is called too early > while the pages are still in use by the GFX unit... This seems the most likely problem -- there are so many caches and buffers involved. However, we're seeing troubles on hibernate resume, at which point there isn't any acceleration going on, just two fbdev drivers poking the hardware. Which really reduces the complexity quite a bit; it's just CPU reads/writes through the GTT aperture created for the two console frame buffers. That makes this an interesting place to look for trouble; we can ignore vast areas within the driver that deal with acceleration, at least for this case. -- keith.pack...@intel.com ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [V5 PATCH 1/4] drivers-gpu-drm-allow-to-load-edid-firmware.patch
On Thu, 15 Mar 2012 15:56:24 BST, Carsten Emde said: > Broken monitors and/or broken graphic boards may send erroneous or no > EDID data. This also applies to broken KVM devices that are unable to > correctly forward the EDID data of the connected monitor but invent > their own fantasy data. > Documentation/EDID/HOWTO.txt| 39 + > Documentation/EDID/Makefile | 26 +++ Documented *and* a Makefile to automate it. Nice. :) That addresses my comments.. Acked-by: Valdis Kletnieks pgpsTPtB0bBud.pgp Description: PGP signature ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: i915 modeset memory corruption issues? (Fwd: Oops in ext3_block_to_path.isra.40+0x26/0x11b)
<#part sign=pgpmime> On Sat, 17 Mar 2012 18:44:18 -0700 (PDT), Hugh Dickins wrote: > __GFP_MOVABLE is a hint to page allocation that there's a good likelihood > that this logical page can be migrated elsewhere in physical memory later > on if mm wants, so it's a good idea to allocate it from a physical area of > similarly MOVABLE pages; So, allocating things with __GFP_MOVABLE may just change where in memory the graphics pages get allocated, moving whatever GPU-inspired damage to less sensitive bits of the kernel data. Sounds fabulous! > I keep worrying about the sequence when the machine is powered on again > after hibernation: can i915 get up to anything before it is resumed from > the hibernation image? Well, the frame buffer is presumably still using whatever mapping it had before suspend occurred; is there any way it could be writing through that before the graphics driver was resumed? What I don't understand is the relationship between the boot kernel and the resumed kernel; when does the boot kernel stop writing to the console, and how does it hand off control of the frame buffer at that time. It would be great if we could separate out the boot kernel access to the graphics system from the resumed system -- if the boot kernel was run without the i915 driver loaded at all, and just used VGA text mode, then any damage as a result of resume wouldn't be caused by the boot kernel GTT mappings getting used at the wrong time. -- keith.pack...@intel.com ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
[Linaro-mm-sig] [PATCH] dma-buf: add get_dma_buf()
Reviewed-by: Kyungmin Park On 3/17/12, Rob Clark wrote: > From: Rob Clark > > Works in a similar way to get_file(), and is needed in cases such as > when the exporter needs to also keep a reference to the dmabuf (that > is later released with a dma_buf_put()), and possibly other similar > cases. > > Signed-off-by: Rob Clark > --- > Minor update on original to add a missing #include > > include/linux/dma-buf.h | 15 +++ > 1 files changed, 15 insertions(+), 0 deletions(-) > > diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h > index 891457a..bc4203dc 100644 > --- a/include/linux/dma-buf.h > +++ b/include/linux/dma-buf.h > @@ -30,6 +30,7 @@ > #include > #include > #include > +#include > > struct dma_buf; > struct dma_buf_attachment; > @@ -110,6 +111,20 @@ struct dma_buf_attachment { > void *priv; > }; > > +/** > + * get_dma_buf - convenience wrapper for get_file. > + * @dmabuf: [in]pointer to dma_buf > + * > + * Increments the reference count on the dma-buf, needed in case of drivers > + * that either need to create additional references to the dmabuf on the > + * kernel side. For example, an exporter that needs to keep a dmabuf ptr > + * so that subsequent exports don't create a new dmabuf. > + */ > +static inline void get_dma_buf(struct dma_buf *dmabuf) > +{ > + get_file(dmabuf->file); > +} > + > #ifdef CONFIG_DMA_SHARED_BUFFER > struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf, > struct device *dev); > -- > 1.7.5.4 > > > ___ > Linaro-mm-sig mailing list > Linaro-mm-sig at lists.linaro.org > http://lists.linaro.org/mailman/listinfo/linaro-mm-sig >
i915 modeset memory corruption issues? (Fwd: Oops in ext3_block_to_path.isra.40+0x26/0x11b)
On Fri, Mar 16, 2012 at 11:01 PM, Keith Packard wrote: > <#part sign=pgpmime> > On Fri, 16 Mar 2012 16:47:46 +, Dave Airlie wrote: > >> The hibernate issue is known and I've been hoping someone from Intel >> would run with debugging it, they have a big enough team that I don't >> feel I can expend the personal time to look into it. > > Yeah, I've been chatting with a couple of intel folks; we should have a > test patch ready shortly, but we haven't been able to reproduce anything > like this... > >> Maybe Keith can push someone or maybe I just refuse pull requests >> until one with a fix appears. > > From what I've seen, this is a problem only on Ironlake machines, which > makes me afraid of some weird GTT flushing issue, given the adventures > we had with VT-d on that hardware where we idle the gpu before any GTT > updates. I wonder what would happen if we idled the GPU before any GTT > updates even when VT-d wasn't running... We've had reports on 965GM in Fedora, maybe davej has the specific bug. Dave.
[Bug 45366] Radeon gpu lockups
https://bugs.freedesktop.org/show_bug.cgi?id=45366 --- Comment #5 from Ernst Sj?strand 2012-03-17 03:39:47 PDT --- This might be related to reclocking the GPU because when I tried dynpm it happened more often and it often happens when waking the computer up from DPMS. -- Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email --- You are receiving this mail because: --- You are the assignee for the bug.
[Bug 29412] fans running at full-speed after resume from suspend with radeon and KMS
https://bugzilla.kernel.org/show_bug.cgi?id=29412 --- Comment #4 from Jon Dowland 2012-03-17 11:27:26 --- Ok retesting v2.6.32-rc1 first, I had trouble building it (gcc 4.6 won't work); trouble getting the initramfs to work, and trouble getting X to start (some kind of race condition / bad failure mode in the gdm3 init script launched ~300 Xorg instances before I killed it). However, I could reproduce the issue without X: single-user mode, no radeon module loaded: fans climbed down as they should. Single-user mode, radeon driver loaded: fans didn't climb down. I'll check v2.6.32-rc1^ next, then I'll collect dmesg/Xorg.0.logs as appropriate. (I confirmed this with 3.3~rc6-1~experimental.1 in the mean time) -- Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email --- You are receiving this mail because: --- You are watching the assignee of the bug.
[Bug 29412] fans running at full-speed after resume from suspend with radeon and KMS
https://bugzilla.kernel.org/show_bug.cgi?id=29412 --- Comment #5 from Jon Dowland 2012-03-17 11:49:13 --- Created an attachment (id=72633) --> (https://bugzilla.kernel.org/attachment.cgi?id=72633) dmesg from v2.6.32-rc1^ Frustratingly, the issue reared its head with v2.6.32-rc1^. This dmesg is from a single user mode session, two suspends with "modprobe radeon" in the middle. It's getting hard to run kernels this old (even though they aren't that old!) notice the error loading the radeon firmware: no idea why it hasn't managed, other kernels do (and it's there). Also udev refuses to start with < 2.6.32. I may have to set up a test root with an older userspace (squeeze perhaps) to go further :( Does this basically invalidate my previous bisect? -- Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email --- You are receiving this mail because: --- You are watching the assignee of the bug.
[PATCH] Only build test programs with make check
Signed-off-by: Matt Turner --- tests/kmstest/Makefile.am |2 +- tests/modeprint/Makefile.am |2 +- tests/modetest/Makefile.am |2 +- tests/radeon/Makefile.am|2 +- tests/vbltest/Makefile.am |2 +- 5 files changed, 5 insertions(+), 5 deletions(-) diff --git a/tests/kmstest/Makefile.am b/tests/kmstest/Makefile.am index ae562a1..10b9ef3 100644 --- a/tests/kmstest/Makefile.am +++ b/tests/kmstest/Makefile.am @@ -3,7 +3,7 @@ AM_CFLAGS = \ -I$(top_srcdir)/libkms/ \ -I$(top_srcdir) -noinst_PROGRAMS = \ +check_PROGRAMS = \ kmstest kmstest_SOURCES = \ diff --git a/tests/modeprint/Makefile.am b/tests/modeprint/Makefile.am index c4862ac..4291dc5 100644 --- a/tests/modeprint/Makefile.am +++ b/tests/modeprint/Makefile.am @@ -2,7 +2,7 @@ AM_CFLAGS = \ -I$(top_srcdir)/include/drm \ -I$(top_srcdir) -noinst_PROGRAMS = \ +check_PROGRAMS = \ modeprint modeprint_SOURCES = \ diff --git a/tests/modetest/Makefile.am b/tests/modetest/Makefile.am index 2191242..dd43d0c 100644 --- a/tests/modetest/Makefile.am +++ b/tests/modetest/Makefile.am @@ -4,7 +4,7 @@ AM_CFLAGS = \ -I$(top_srcdir) \ $(CAIRO_CFLAGS) -noinst_PROGRAMS = \ +check_PROGRAMS = \ modetest modetest_SOURCES = \ diff --git a/tests/radeon/Makefile.am b/tests/radeon/Makefile.am index 1775669..d4f6755 100644 --- a/tests/radeon/Makefile.am +++ b/tests/radeon/Makefile.am @@ -4,7 +4,7 @@ AM_CFLAGS = \ LDADD = $(top_builddir)/libdrm.la -noinst_PROGRAMS = \ +check_PROGRAMS = \ radeon_ttm radeon_ttm_SOURCES = \ diff --git a/tests/vbltest/Makefile.am b/tests/vbltest/Makefile.am index 77f9037..5886bd1 100644 --- a/tests/vbltest/Makefile.am +++ b/tests/vbltest/Makefile.am @@ -2,7 +2,7 @@ AM_CFLAGS = \ -I$(top_srcdir)/include/drm \ -I$(top_srcdir) -noinst_PROGRAMS = \ +check_PROGRAMS = \ vbltest vbltest_SOURCES = \ -- 1.7.3.4
[PATCH] Only build test programs with make check
With my limited knowledge of automake I'm going to have to NACK this patch. Most of these programs are used during driver bring up to test things out, often we also modify them a bit to suit our need. If this patch doesn't make this any harder then disregard my NACK. Cheers Jakob. - Original Message - > Signed-off-by: Matt Turner > --- > tests/kmstest/Makefile.am |2 +- > tests/modeprint/Makefile.am |2 +- > tests/modetest/Makefile.am |2 +- > tests/radeon/Makefile.am|2 +- > tests/vbltest/Makefile.am |2 +- > 5 files changed, 5 insertions(+), 5 deletions(-) > > diff --git a/tests/kmstest/Makefile.am b/tests/kmstest/Makefile.am > index ae562a1..10b9ef3 100644 > --- a/tests/kmstest/Makefile.am > +++ b/tests/kmstest/Makefile.am > @@ -3,7 +3,7 @@ AM_CFLAGS = \ > -I$(top_srcdir)/libkms/ \ > -I$(top_srcdir) > > -noinst_PROGRAMS = \ > +check_PROGRAMS = \ > kmstest > > kmstest_SOURCES = \ > diff --git a/tests/modeprint/Makefile.am > b/tests/modeprint/Makefile.am > index c4862ac..4291dc5 100644 > --- a/tests/modeprint/Makefile.am > +++ b/tests/modeprint/Makefile.am > @@ -2,7 +2,7 @@ AM_CFLAGS = \ > -I$(top_srcdir)/include/drm \ > -I$(top_srcdir) > > -noinst_PROGRAMS = \ > +check_PROGRAMS = \ > modeprint > > modeprint_SOURCES = \ > diff --git a/tests/modetest/Makefile.am b/tests/modetest/Makefile.am > index 2191242..dd43d0c 100644 > --- a/tests/modetest/Makefile.am > +++ b/tests/modetest/Makefile.am > @@ -4,7 +4,7 @@ AM_CFLAGS = \ > -I$(top_srcdir) \ > $(CAIRO_CFLAGS) > > -noinst_PROGRAMS = \ > +check_PROGRAMS = \ > modetest > > modetest_SOURCES = \ > diff --git a/tests/radeon/Makefile.am b/tests/radeon/Makefile.am > index 1775669..d4f6755 100644 > --- a/tests/radeon/Makefile.am > +++ b/tests/radeon/Makefile.am > @@ -4,7 +4,7 @@ AM_CFLAGS = \ > > LDADD = $(top_builddir)/libdrm.la > > -noinst_PROGRAMS = \ > +check_PROGRAMS = \ > radeon_ttm > > radeon_ttm_SOURCES = \ > diff --git a/tests/vbltest/Makefile.am b/tests/vbltest/Makefile.am > index 77f9037..5886bd1 100644 > --- a/tests/vbltest/Makefile.am > +++ b/tests/vbltest/Makefile.am > @@ -2,7 +2,7 @@ AM_CFLAGS = \ > -I$(top_srcdir)/include/drm \ > -I$(top_srcdir) > > -noinst_PROGRAMS = \ > +check_PROGRAMS = \ > vbltest > > vbltest_SOURCES = \ > -- > 1.7.3.4 > > ___ > dri-devel mailing list > dri-devel at lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/dri-devel >
[PATCH] Only build test programs with make check
On Sat, Mar 17, 2012 at 2:35 PM, Jakob Bornecrantz wrote: > With my limited knowledge of automake I'm going to have > to NACK this patch. Most of these programs are used during > driver bring up to test things out, often we also modify > them a bit to suit our need. > > If this patch doesn't make this any harder then > disregard my NACK. Harder, like having to run make check? I'm not sure about modeprint, so I can drop that, but the rest look like programs you'd want to run during make check.
i915 modeset memory corruption issues? (Fwd: Oops in ext3_block_to_path.isra.40+0x26/0x11b)
<#part sign=pgpmime> On Sat, 17 Mar 2012 07:41:14 +, Dave Airlie wrote: > We've had reports on 965GM in Fedora, maybe davej has the specific > bug. Are these reports relatively recent, leading to a possible software bug introduced in the last couple of releases? -- keith.packard at intel.com
i915 modeset memory corruption issues? (Fwd: Oops in ext3_block_to_path.isra.40+0x26/0x11b)
On Sat, Mar 17, 2012 at 6:59 PM, Keith Packard wrote: > <#part sign=pgpmime> > On Sat, 17 Mar 2012 07:41:14 +, Dave Airlie wrote: > >> We've had reports on 965GM in Fedora, maybe davej has the specific >> bug. > > Are these reports relatively recent, leading to a possible software bug > introduced in the last couple of releases? No I think they are since 0-day KMS reports. but its St Patricks day so really not a reliable info source for a couple of days :-) Dave.
[PATCH] RFC: dma-buf: userspace mmap support
> > dma-buf file descriptor. Userspace access to the buffer should be > > bracketed with DMA_BUF_IOCTL_{PREPARE,FINISH}_ACCESS ioctl calls to > > give the exporting driver a chance to deal with cache synchronization > > and such for cached userspace mappings without resorting to page There should be flags indicating if this is necessary. We don't want extra syscalls on hardware that doesn't need it. The other question is what info is needed as you may only want to poke a few pages out of cache and the prepare/finish on its own gives no info. > E.g. If another device was writing to the buffer, the prepare ioctl > could block until that device had finished accessing that buffer. How do you avoid deadlocks on this ? We need very clear ways to ensure things always complete in some form given multiple buffer owner/requestors and the fact this API has no "prepare-multiple-buffers" support. Alan
i915 modeset memory corruption issues? (Fwd: Oops in ext3_block_to_path.isra.40+0x26/0x11b)
On Sat, Mar 17, 2012 at 7:20 PM, Dave Airlie wrote: > On Sat, Mar 17, 2012 at 6:59 PM, Keith Packard wrote: >> <#part sign=pgpmime> >> On Sat, 17 Mar 2012 07:41:14 +, Dave Airlie wrote: >> >>> We've had reports on 965GM in Fedora, maybe davej has the specific >>> bug. >> >> Are these reports relatively recent, leading to a possible software bug >> introduced in the last couple of releases? > > No I think they are since 0-day KMS reports. > > but its St Patricks day so really not a reliable info source for a > couple of days :-) I did however get a flashback in google to this: http://lists.fedoraproject.org/pipermail/scm-commits/2010-July/456636.html Linus don't think we ever did work out why that worked, I wonder if we lost something after that. Dave.
i915 modeset memory corruption issues? (Fwd: Oops in ext3_block_to_path.isra.40+0x26/0x11b)
Okay found the 965GM report https://bugzilla.kernel.org/show_bug.cgi?id=37142 Dave.
i915 modeset memory corruption issues? (Fwd: Oops in ext3_block_to_path.isra.40+0x26/0x11b)
On Sat, Mar 17, 2012 at 1:23 PM, Dave Airlie wrote: > > I did however get a flashback in google to this: > > http://lists.fedoraproject.org/pipermail/scm-commits/2010-July/456636.html > > Linus don't think we ever did work out why that worked, I wonder if we > lost something after that. Hmm. Maybe we should stop making the default gfp_mask be GFP_HIGHMEM_MOVABLE (see inode_init_always() in fs/inode.c), and instead add the _MOVABLE flag only for mappings that use "generic_file_mmap()". It does sound a bit scary to just make default mappings use MOVABLE pages, when we know that can be incorrect for some cases. But that would require that filesystems like ext4 etc that don't just use the generic_file_mmap() function would have add do that MOVABLE thing on their own. And the *normal* case is certainly that pages are movable and putting them in themovable zone should be ok. It's just *not* ok if you also map them into some hardware GTT thing or just otherwise keep track of the pages directly, like DRI does. The other possibility is to just make this a shm thing, since it's generally that layer that has the case of "pages allocated with the mapping gfp masks". Hugh Dickins clearly tried to make sure that the DRM initialization of the gfp mask was honored, but maybe there is some case where it is missed. Hugh added a mapping_set_gfp_mask() to i915_gem_alloc_object(), but maybe we have i915 shmem mappings that get set up through some other path. What about the ttm swap storage thing, for example? That also does shmem_file_setup(), without limiting the pages. But I don't think i915 uses ttm, right? Al? Hugh? Opinions? Right now anybody who uses the shmem_read_mapping_page() function is in danger of getting a MOVABLE page by default. Which may or may not be right. That said, I don't see how i915 would get a movable page. It *seems* to do the right setup for the gfp flags of the mapping. Linus
[PATCH] RFC: dma-buf: userspace mmap support
On Sat, Mar 17, 2012 at 3:17 PM, Alan Cox wrote: >> > dma-buf file descriptor. ?Userspace access to the buffer should be >> > bracketed with DMA_BUF_IOCTL_{PREPARE,FINISH}_ACCESS ioctl calls to >> > give the exporting driver a chance to deal with cache synchronization >> > and such for cached userspace mappings without resorting to page > > There should be flags indicating if this is necessary. We don't want extra > syscalls on hardware that doesn't need it. The other question is what > info is needed as you may only want to poke a few pages out of cache and > the prepare/finish on its own gives no info. Well, there isn't really a convenient way to know, for some random code that is just handed a dmabuf fd, what the flags are without passing around an extra param in userspace. So I'd tend to say, just live with the syscall even if it is a no-op (because if you are doing sw access to the buffer, that is anyways some slow/legacy path). But I'm open to suggestions. As far as just peeking/poking a few pages, that is where some later ioctls or additional params could come in, to give some hints. But I wanted to keep it simple to start. >> E.g. If another device was writing to the buffer, the prepare ioctl >> could block until that device had finished accessing that buffer. > > How do you avoid deadlocks on this ? We need very clear ways to ensure > things always complete in some form given multiple buffer > owner/requestors and the fact this API has no "prepare-multiple-buffers" > support. Probably some separate synchronization is needed.. I'm not really sure if prepare/finish (or map/unmap on kernel side) is a the right way to handle that. BR, -R > Alan > ___ > dri-devel mailing list > dri-devel at lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/dri-devel
i915 modeset memory corruption issues? (Fwd: Oops in ext3_block_to_path.isra.40+0x26/0x11b)
On Sat, Mar 17, 2012 at 3:09 PM, Hugh Dickins wrote: > > I've got to go out for an hour: I'll digest more and think more about > this when I get back. ?If someone could explain the original problem > with _MOVABLE, that would help me: I do not believe we actually ever uncovered the original problem with _MOVABLE: the problem was bisected down to the stable-backported version of commit 4bdadb978569 ("drm/i915: Selectively enable self-reclaim"), and I looked at the changes and decided that one of the main ones was the removal of the mapping_set_gfp_mask() - which resulted in __GFP_MOVABLE being on for the mapping. So we just tried re-introducing that, and it fixed the problem. Exactly *why* movable pages are a problem was never all that clear. The assumption was (and I guess still is) that there are physical pointers to the pages in the DRM pages themselves. Quoting from the original thread: "I could easily see that something would get very unhappy and corrupt memory if the suspend-to-disk phase ends up compacting memory and moving the pages around from under the i915 driver." but I didn't actually see why the i915 page pinning would be defeated by __GFP_MOVABLE. The code does get a reference to them afaik. So no, we don't have a real first cause. We just have a band-aid for the symptom. It might be a pure bug in hibernation that just assumes that any pages in MOVABLE zones can always be moved without even checking any refcounts, so not using __GFP_MOVABLE might just be working around problems elsewhere. Or it's entirely possible that the DRM layer does *not* increment the refcounts of the pages properly, and that hibernate with movable pages is just a very efficient way of showing the problems that results in. > A factor which might turn out to be relevant: swapin_readahead() (or > swapoff) brings in pages from swap before it knows who they belong to, > so the swapped-in swapcache pages don't necessarily have the limitations > mapping_set_gfp_mask() has asked for. ?It's been discussed with Alan, > we know it will need fixing for gma500 when eventually that comes to > be used on machines with more than 4GB, but for now I wasn't aware of > a problem. So i915 shouldn't much care about highmem vs non-highmem, it's just that it does hold on to pages and take their physical address. But as far as I can tell, whenever the physical address is in use, it does have a refcount to the page. So for example, i915_gem_object_get_pages_gtt() will use shmem_read_mapping_page_gfp() which will increment the page count for the page it gets, so all the obj->pages[] entries should have properly incremented page counts. And they get released by i915_gem_object_put_pages_gtt(), but maybe that is called too early while the pages are still in use by the GFX unit... If there is a refcounting issue, hibernate + __GFP_MOVABLE might just be a great way to see it. I don't really know the code, though. Linus
i915 modeset memory corruption issues? (Fwd: Oops in ext3_block_to_path.isra.40+0x26/0x11b)
<#part sign=pgpmime> On Sat, 17 Mar 2012 15:52:15 -0700, Linus Torvalds wrote: > I do not believe we actually ever uncovered the original problem with > _MOVABLE: the problem was bisected down to the stable-backported > version of commit 4bdadb978569 ("drm/i915: Selectively enable > self-reclaim"), and I looked at the changes and decided that one of > the main ones was the removal of the mapping_set_gfp_mask() - which > resulted in __GFP_MOVABLE being on for the mapping. Can anyone explain what __GFP_MOVABLE even does? I can't understand what this flag would be for; if the page is locked (with get_page), then the page cannot move. If it isn't locked, then it's subject to swapping, in which case the page will almost certainly move when it returns from disk. Is it that the page won't move if it isn't swapped? That doesn't seem all that useful to me. > but I didn't actually see why the i915 page pinning would be defeated > by __GFP_MOVABLE. The code does get a reference to them afaik. GTT mapping and page locking are done in lock-step in the driver: i915_gem_object_bind_to_gtt i915_gem_object_get_pages_gtt pins the pages i915_gem_gtt_bind_object maps to GTT i915_gem_object_unbind i915_gem_gtt_unbind_object unmaps from GTT i915_gem_object_put_pages_gtt unpins the pages. There are no other code paths to unmapping objects from the GTT or unpinning the pages that I can find. > So for example, i915_gem_object_get_pages_gtt() will use > shmem_read_mapping_page_gfp() which will increment the page count for > the page it gets, so all the obj->pages[] entries should have properly > incremented page counts. And they get released by > i915_gem_object_put_pages_gtt(), but maybe that is called too early > while the pages are still in use by the GFX unit... This seems the most likely problem -- there are so many caches and buffers involved. However, we're seeing troubles on hibernate resume, at which point there isn't any acceleration going on, just two fbdev drivers poking the hardware. Which really reduces the complexity quite a bit; it's just CPU reads/writes through the GTT aperture created for the two console frame buffers. That makes this an interesting place to look for trouble; we can ignore vast areas within the driver that deal with acceleration, at least for this case. -- keith.packard at intel.com
[V5 PATCH 1/4] drivers-gpu-drm-allow-to-load-edid-firmware.patch
On Thu, 15 Mar 2012 15:56:24 BST, Carsten Emde said: > Broken monitors and/or broken graphic boards may send erroneous or no > EDID data. This also applies to broken KVM devices that are unable to > correctly forward the EDID data of the connected monitor but invent > their own fantasy data. > Documentation/EDID/HOWTO.txt| 39 + > Documentation/EDID/Makefile | 26 +++ Documented *and* a Makefile to automate it. Nice. :) That addresses my comments.. Acked-by: Valdis Kletnieks -- next part -- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 865 bytes Desc: not available URL: <http://lists.freedesktop.org/archives/dri-devel/attachments/20120317/dbda4b5b/attachment.pgp>
i915 modeset memory corruption issues? (Fwd: Oops in ext3_block_to_path.isra.40+0x26/0x11b)
<#part sign=pgpmime> On Sat, 17 Mar 2012 18:44:18 -0700 (PDT), Hugh Dickins wrote: > __GFP_MOVABLE is a hint to page allocation that there's a good likelihood > that this logical page can be migrated elsewhere in physical memory later > on if mm wants, so it's a good idea to allocate it from a physical area of > similarly MOVABLE pages; So, allocating things with __GFP_MOVABLE may just change where in memory the graphics pages get allocated, moving whatever GPU-inspired damage to less sensitive bits of the kernel data. Sounds fabulous! > I keep worrying about the sequence when the machine is powered on again > after hibernation: can i915 get up to anything before it is resumed from > the hibernation image? Well, the frame buffer is presumably still using whatever mapping it had before suspend occurred; is there any way it could be writing through that before the graphics driver was resumed? What I don't understand is the relationship between the boot kernel and the resumed kernel; when does the boot kernel stop writing to the console, and how does it hand off control of the frame buffer at that time. It would be great if we could separate out the boot kernel access to the graphics system from the resumed system -- if the boot kernel was run without the i915 driver loaded at all, and just used VGA text mode, then any damage as a result of resume wouldn't be caused by the boot kernel GTT mappings getting used at the wrong time. -- keith.packard at intel.com
[PATCH] drivers/gpu/drm/radeon/radeon_cs.c: eliminate possible double free
From: Julia Lawall The function radeon_cs_parser_init is only called from two places, in drivers/gpu/drm/radeon/radeon_cs.c and drivers/gpu/drm/radeon/r600_cs.c. In each case, if the call fails another function is called that frees all of the kdata and dpage information in the chunks array. So this information should not be freed in radeon_cs_parser_init as well. Signed-off-by: Julia Lawall --- drivers/gpu/drm/radeon/radeon_cs.c | 16 ++-- 1 file changed, 2 insertions(+), 14 deletions(-) diff --git a/drivers/gpu/drm/radeon/radeon_cs.c b/drivers/gpu/drm/radeon/radeon_cs.c index 9b4124e..d9d9f5a 100644 --- a/drivers/gpu/drm/radeon/radeon_cs.c +++ b/drivers/gpu/drm/radeon/radeon_cs.c @@ -243,20 +243,11 @@ int radeon_cs_parser_init(struct radeon_cs_parser *p, void *data) if ((p->cs_flags & RADEON_CS_USE_VM) && !p->rdev->vm_manager.enabled) { DRM_ERROR("VM not active on asic!\n"); - if (p->chunk_relocs_idx != -1) - kfree(p->chunks[p->chunk_relocs_idx].kdata); - if (p->chunk_flags_idx != -1) - kfree(p->chunks[p->chunk_flags_idx].kdata); return -EINVAL; } - if (radeon_cs_get_ring(p, ring, priority)) { - if (p->chunk_relocs_idx != -1) - kfree(p->chunks[p->chunk_relocs_idx].kdata); - if (p->chunk_flags_idx != -1) - kfree(p->chunks[p->chunk_flags_idx].kdata); + if (radeon_cs_get_ring(p, ring, priority)) return -EINVAL; - } /* deal with non-vm */ @@ -271,11 +262,8 @@ int radeon_cs_parser_init(struct radeon_cs_parser *p, void *data) p->chunks[p->chunk_ib_idx].kpage[0] = kmalloc(PAGE_SIZE, GFP_KERNEL); p->chunks[p->chunk_ib_idx].kpage[1] = kmalloc(PAGE_SIZE, GFP_KERNEL); if (p->chunks[p->chunk_ib_idx].kpage[0] == NULL || - p->chunks[p->chunk_ib_idx].kpage[1] == NULL) { - kfree(p->chunks[p->chunk_ib_idx].kpage[0]); - kfree(p->chunks[p->chunk_ib_idx].kpage[1]); + p->chunks[p->chunk_ib_idx].kpage[1] == NULL) return -ENOMEM; - } p->chunks[p->chunk_ib_idx].kpage_idx[0] = -1; p->chunks[p->chunk_ib_idx].kpage_idx[1] = -1; p->chunks[p->chunk_ib_idx].last_copied_page = -1;
[PATCH] drivers/gpu/drm/savage/savage_state.c: add missing kfree
From: Julia Lawall Most of the error handling code in this function frees the buffers kcmd_addr, kvb_addr, and kbox_addr allocated at the beginning of this function. These two branches are changed to do the same. Signed-off-by: Julia Lawall --- drivers/gpu/drm/savage/savage_state.c |5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/savage/savage_state.c b/drivers/gpu/drm/savage/savage_state.c index 8a3e315..031aaaf 100644 --- a/drivers/gpu/drm/savage/savage_state.c +++ b/drivers/gpu/drm/savage/savage_state.c @@ -1057,7 +1057,8 @@ int savage_bci_cmdbuf(struct drm_device *dev, void *data, struct drm_file *file_ DRM_ERROR("indexed drawing command extends " "beyond end of command buffer\n"); DMA_FLUSH(); - return -EINVAL; + ret = -EINVAL; + goto done; } /* fall through */ case SAVAGE_CMD_DMA_PRIM: @@ -1076,7 +1077,7 @@ int savage_bci_cmdbuf(struct drm_device *dev, void *data, struct drm_file *file_ cmdbuf->vb_stride, cmdbuf->nbox, cmdbuf->box_addr); if (ret != 0) - return ret; + goto done; first_draw_cmd = NULL; } }
i915 modeset memory corruption issues? (Fwd: Oops in ext3_block_to_path.isra.40+0x26/0x11b)
On Sat, 17 Mar 2012, Linus Torvalds wrote: > On Sat, Mar 17, 2012 at 1:23 PM, Dave Airlie wrote: > > > > I did however get a flashback in google to this: > > > > http://lists.fedoraproject.org/pipermail/scm-commits/2010-July/456636.html > > > > Linus don't think we ever did work out why that worked, I wonder if we > > lost something after that. > > Hmm. Maybe we should stop making the default gfp_mask be > GFP_HIGHMEM_MOVABLE (see inode_init_always() in fs/inode.c), and > instead add the _MOVABLE flag only for mappings that use > "generic_file_mmap()". > > It does sound a bit scary to just make default mappings use MOVABLE > pages, when we know that can be incorrect for some cases. > > But that would require that filesystems like ext4 etc that don't just > use the generic_file_mmap() function would have add do that MOVABLE > thing on their own. And the *normal* case is certainly that pages are > movable and putting them in themovable zone should be ok. It's just > *not* ok if you also map them into some hardware GTT thing or just > otherwise keep track of the pages directly, like DRI does. > > The other possibility is to just make this a shm thing, since it's > generally that layer that has the case of "pages allocated with the > mapping gfp masks". Hugh Dickins clearly tried to make sure that the > DRM initialization of the gfp mask was honored, but maybe there is > some case where it is missed. Hugh added a mapping_set_gfp_mask() to > i915_gem_alloc_object(), but maybe we have i915 shmem mappings that > get set up through some other path. > > What about the ttm swap storage thing, for example? That also does > shmem_file_setup(), without limiting the pages. But I don't think i915 > uses ttm, right? > > Al? Hugh? Opinions? Right now anybody who uses the > shmem_read_mapping_page() function is in danger of getting a MOVABLE > page by default. Which may or may not be right. > > That said, I don't see how i915 would get a movable page. It *seems* > to do the right setup for the gfp flags of the mapping. Yes, i915_gem_alloc_object() does its own mapping_set_gfp_mask(mapping, GFP_HIGHUSER | __GFP_RECLAIMABLE), which should be as good as ever. I've got to go out for an hour: I'll digest more and think more about this when I get back. If someone could explain the original problem with _MOVABLE, that would help me: I didn't see an explanation in the patch or in the bugzilla. I would expect using _MOVABLE for i915_gem would render a block of otherwise movable pages immovable because of the GEM pages pinned, but I wouldn't expect them to move in mysterious ways - or are references held to the pages without them being pinned??. A factor which might turn out to be relevant: swapin_readahead() (or swapoff) brings in pages from swap before it knows who they belong to, so the swapped-in swapcache pages don't necessarily have the limitations mapping_set_gfp_mask() has asked for. It's been discussed with Alan, we know it will need fixing for gma500 when eventually that comes to be used on machines with more than 4GB, but for now I wasn't aware of a problem. Hugh
i915 modeset memory corruption issues? (Fwd: Oops in ext3_block_to_path.isra.40+0x26/0x11b)
On Sat, 17 Mar 2012, Keith Packard wrote: > On Sat, 17 Mar 2012 15:52:15 -0700, Linus Torvalds linux-foundation.org> wrote: > > > I do not believe we actually ever uncovered the original problem with > > _MOVABLE: the problem was bisected down to the stable-backported > > version of commit 4bdadb978569 ("drm/i915: Selectively enable > > self-reclaim"), and I looked at the changes and decided that one of > > the main ones was the removal of the mapping_set_gfp_mask() - which > > resulted in __GFP_MOVABLE being on for the mapping. > > Can anyone explain what __GFP_MOVABLE even does? I can't understand what > this flag would be for; if the page is locked (with get_page), then the > page cannot move. If it isn't locked, then it's subject to swapping, in > which case the page will almost certainly move when it returns from > disk. Is it that the page won't move if it isn't swapped? That doesn't > seem all that useful to me. __GFP_MOVABLE is a hint to page allocation that there's a good likelihood that this logical page can be migrated elsewhere in physical memory later on if mm wants, so it's a good idea to allocate it from a physical area of similarly MOVABLE pages; then if later on someone wants a large contiguous area for something (or wants to hot-unplug that memory), it should be easy to clear the whole area out, moving existing pages elsewhere. (I think that's right: several questions come to me as I write it, but now is not the time to research all those details.) Page migration can only be done later if it can account for all of page_count(page). > > > but I didn't actually see why the i915 page pinning would be defeated > > by __GFP_MOVABLE. The code does get a reference to them afaik. > > GTT mapping and page locking are done in lock-step in the driver: > > i915_gem_object_bind_to_gtt > i915_gem_object_get_pages_gtt > pins the pages > i915_gem_gtt_bind_object > maps to GTT > > i915_gem_object_unbind > i915_gem_gtt_unbind_object > unmaps from GTT > i915_gem_object_put_pages_gtt > unpins the pages. > > There are no other code paths to unmapping objects from the GTT or > unpinning the pages that I can find. > > > So for example, i915_gem_object_get_pages_gtt() will use > > shmem_read_mapping_page_gfp() which will increment the page count for > > the page it gets, so all the obj->pages[] entries should have properly > > incremented page counts. And they get released by > > i915_gem_object_put_pages_gtt(), but maybe that is called too early > > while the pages are still in use by the GFX unit... > > This seems the most likely problem -- there are so many caches and > buffers involved. However, we're seeing troubles on hibernate resume, at > which point there isn't any acceleration going on, just two fbdev > drivers poking the hardware. Which really reduces the complexity quite a > bit; it's just CPU reads/writes through the GTT aperture created for the > two console frame buffers. That makes this an interesting place to look > for trouble; we can ignore vast areas within the driver that deal with > acceleration, at least for this case. I keep worrying about the sequence when the machine is powered on again after hibernation: can i915 get up to anything before it is resumed from the hibernation image? Get to use certain pages at that stage, then continue to poke at them after the hibernation image is restored (which changes the story of what pages are free and what are used for what): lacking some kind of flush? Hugh
i915 modeset memory corruption issues? (Fwd: Oops in ext3_block_to_path.isra.40+0x26/0x11b)
Added Rafael to the Cc: Rafael, we're pondering over one or more of these recurrent threads about corruption after resume, seemingly related to i915. On Sat, 17 Mar 2012, Keith Packard wrote: > On Sat, 17 Mar 2012 18:44:18 -0700 (PDT), Hugh Dickins > wrote: > > I keep worrying about the sequence when the machine is powered on again > > after hibernation: can i915 get up to anything before it is resumed from > > the hibernation image? > > Well, the frame buffer is presumably still using whatever mapping it had > before suspend occurred; is there any way it could be writing through > that before the graphics driver was resumed? It's hibernation restore here, so I don't think it could be using the mapping from before hibernation until after resuming from hibernation snapshot: it would be using the rebooting kernel's mapping until then. > > What I don't understand is the relationship between the boot kernel and > the resumed kernel; when does the boot kernel stop writing to the > console, and how does it hand off control of the frame buffer at that > time. I believe the handoff point comes in the late initcall software_resume(): which loads the image and calls hibernation_restore -> resume_target_kernel -> swsusp_arch_resume, which emerges into the restored hibernation image. As a late initcall, I imagine some work has already been done via the framebuffer, but I have no conception of what kind of mappings that involves (would shmem objects come into it at all? and is that even a relevant question, could enough damage be done without them?), nor whether they're properly torn down before emerging into the hibernimage. > > It would be great if we could separate out the boot kernel access to the > graphics system from the resumed system -- if the boot kernel was run > without the i915 driver loaded at all, and just used VGA text mode, then > any damage as a result of resume wouldn't be caused by the boot kernel > GTT mappings getting used at the wrong time. But you're giving my worry more credence than it deserves there: we don't have any evidence that this is where the problem lies, that's just a suspicion of mine at the moment. Hugh