Re: i915 modeset memory corruption issues? (Fwd: Oops in ext3_block_to_path.isra.40+0x26/0x11b)

2012-03-17 Thread Dave Airlie
On Fri, Mar 16, 2012 at 11:01 PM, Keith Packard  wrote:
> <#part sign=pgpmime>
> On Fri, 16 Mar 2012 16:47:46 +, Dave Airlie  wrote:
>
>> The hibernate issue is known and I've been hoping someone from Intel
>> would run with debugging it, they have a big enough team that I don't
>> feel I can expend the personal time to look into it.
>
> Yeah, I've been chatting with a couple of intel folks; we should have a
> test patch ready shortly, but we haven't been able to reproduce anything
> like this...
>
>> Maybe Keith can push someone or maybe I just refuse pull requests
>> until one with a fix appears.
>
> From what I've seen, this is a problem only on Ironlake machines, which
> makes me afraid of some weird GTT flushing issue, given the adventures
> we had with VT-d on that hardware where we idle the gpu before any GTT
> updates. I wonder what would happen if we idled the GPU before any GTT
> updates even when VT-d wasn't running...

We've had reports on 965GM in Fedora, maybe davej has the specific bug.

Dave.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


[Bug 45366] Radeon gpu lockups

2012-03-17 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=45366

--- Comment #5 from Ernst Sjöstrand  2012-03-17 03:39:47 PDT 
---
This might be related to reclocking the GPU because when I tried dynpm it
happened more often and it often happens when waking the computer up from DPMS.

-- 
Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are the assignee for the bug.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


[Bug 29412] fans running at full-speed after resume from suspend with radeon and KMS

2012-03-17 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=29412





--- Comment #4 from Jon Dowland   
2012-03-17 11:27:26 ---
Ok retesting v2.6.32-rc1 first, I had trouble building it (gcc 4.6 won't work);
trouble getting the initramfs to work, and trouble getting X to start (some
kind of race condition / bad failure mode in the gdm3 init script launched ~300
Xorg instances before I killed it).

However, I could reproduce the issue without X: single-user mode, no radeon
module loaded: fans climbed down as they should. Single-user mode, radeon
driver loaded: fans didn't climb down.

I'll check v2.6.32-rc1^ next, then I'll collect dmesg/Xorg.0.logs as
appropriate.

(I confirmed this with 3.3~rc6-1~experimental.1 in the mean time)

-- 
Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are watching the assignee of the bug.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


[Bug 29412] fans running at full-speed after resume from suspend with radeon and KMS

2012-03-17 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=29412





--- Comment #5 from Jon Dowland   
2012-03-17 11:49:13 ---
Created an attachment (id=72633)
 --> (https://bugzilla.kernel.org/attachment.cgi?id=72633)
dmesg from v2.6.32-rc1^

Frustratingly, the issue reared its head with v2.6.32-rc1^.  This dmesg is from
a single user mode session, two suspends with "modprobe radeon" in the middle.

It's getting hard to run kernels this old (even though they aren't that old!)
notice the error loading the radeon firmware: no idea why it hasn't managed,
other kernels do (and it's there).  Also udev refuses to start with < 2.6.32.

I may have to set up a test root with an older userspace (squeeze perhaps) to
go further :(

Does this basically invalidate my previous bisect?

-- 
Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are watching the assignee of the bug.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


[PATCH] Only build test programs with make check

2012-03-17 Thread Matt Turner
Signed-off-by: Matt Turner 
---
 tests/kmstest/Makefile.am   |2 +-
 tests/modeprint/Makefile.am |2 +-
 tests/modetest/Makefile.am  |2 +-
 tests/radeon/Makefile.am|2 +-
 tests/vbltest/Makefile.am   |2 +-
 5 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/tests/kmstest/Makefile.am b/tests/kmstest/Makefile.am
index ae562a1..10b9ef3 100644
--- a/tests/kmstest/Makefile.am
+++ b/tests/kmstest/Makefile.am
@@ -3,7 +3,7 @@ AM_CFLAGS = \
-I$(top_srcdir)/libkms/ \
-I$(top_srcdir)
 
-noinst_PROGRAMS = \
+check_PROGRAMS = \
kmstest
 
 kmstest_SOURCES = \
diff --git a/tests/modeprint/Makefile.am b/tests/modeprint/Makefile.am
index c4862ac..4291dc5 100644
--- a/tests/modeprint/Makefile.am
+++ b/tests/modeprint/Makefile.am
@@ -2,7 +2,7 @@ AM_CFLAGS = \
-I$(top_srcdir)/include/drm \
-I$(top_srcdir)
 
-noinst_PROGRAMS = \
+check_PROGRAMS = \
modeprint
 
 modeprint_SOURCES = \
diff --git a/tests/modetest/Makefile.am b/tests/modetest/Makefile.am
index 2191242..dd43d0c 100644
--- a/tests/modetest/Makefile.am
+++ b/tests/modetest/Makefile.am
@@ -4,7 +4,7 @@ AM_CFLAGS = \
-I$(top_srcdir) \
$(CAIRO_CFLAGS)
 
-noinst_PROGRAMS = \
+check_PROGRAMS = \
modetest
 
 modetest_SOURCES = \
diff --git a/tests/radeon/Makefile.am b/tests/radeon/Makefile.am
index 1775669..d4f6755 100644
--- a/tests/radeon/Makefile.am
+++ b/tests/radeon/Makefile.am
@@ -4,7 +4,7 @@ AM_CFLAGS = \
 
 LDADD = $(top_builddir)/libdrm.la
 
-noinst_PROGRAMS = \
+check_PROGRAMS = \
radeon_ttm
 
 radeon_ttm_SOURCES = \
diff --git a/tests/vbltest/Makefile.am b/tests/vbltest/Makefile.am
index 77f9037..5886bd1 100644
--- a/tests/vbltest/Makefile.am
+++ b/tests/vbltest/Makefile.am
@@ -2,7 +2,7 @@ AM_CFLAGS = \
-I$(top_srcdir)/include/drm \
-I$(top_srcdir)
 
-noinst_PROGRAMS = \
+check_PROGRAMS = \
vbltest
 
 vbltest_SOURCES = \
-- 
1.7.3.4

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH] Only build test programs with make check

2012-03-17 Thread Jakob Bornecrantz
With my limited knowledge of automake I'm going to have
to NACK this patch. Most of these programs are used during
driver bring up to test things out, often we also modify
them a bit to suit our need.

If this patch doesn't make this any harder then
disregard my NACK.

Cheers Jakob.

- Original Message -
> Signed-off-by: Matt Turner 
> ---
>  tests/kmstest/Makefile.am   |2 +-
>  tests/modeprint/Makefile.am |2 +-
>  tests/modetest/Makefile.am  |2 +-
>  tests/radeon/Makefile.am|2 +-
>  tests/vbltest/Makefile.am   |2 +-
>  5 files changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/tests/kmstest/Makefile.am b/tests/kmstest/Makefile.am
> index ae562a1..10b9ef3 100644
> --- a/tests/kmstest/Makefile.am
> +++ b/tests/kmstest/Makefile.am
> @@ -3,7 +3,7 @@ AM_CFLAGS = \
>   -I$(top_srcdir)/libkms/ \
>   -I$(top_srcdir)
>  
> -noinst_PROGRAMS = \
> +check_PROGRAMS = \
>   kmstest
>  
>  kmstest_SOURCES = \
> diff --git a/tests/modeprint/Makefile.am
> b/tests/modeprint/Makefile.am
> index c4862ac..4291dc5 100644
> --- a/tests/modeprint/Makefile.am
> +++ b/tests/modeprint/Makefile.am
> @@ -2,7 +2,7 @@ AM_CFLAGS = \
>   -I$(top_srcdir)/include/drm \
>   -I$(top_srcdir)
>  
> -noinst_PROGRAMS = \
> +check_PROGRAMS = \
>   modeprint
>  
>  modeprint_SOURCES = \
> diff --git a/tests/modetest/Makefile.am b/tests/modetest/Makefile.am
> index 2191242..dd43d0c 100644
> --- a/tests/modetest/Makefile.am
> +++ b/tests/modetest/Makefile.am
> @@ -4,7 +4,7 @@ AM_CFLAGS = \
>   -I$(top_srcdir) \
>   $(CAIRO_CFLAGS)
>  
> -noinst_PROGRAMS = \
> +check_PROGRAMS = \
>   modetest
>  
>  modetest_SOURCES = \
> diff --git a/tests/radeon/Makefile.am b/tests/radeon/Makefile.am
> index 1775669..d4f6755 100644
> --- a/tests/radeon/Makefile.am
> +++ b/tests/radeon/Makefile.am
> @@ -4,7 +4,7 @@ AM_CFLAGS = \
>  
>  LDADD = $(top_builddir)/libdrm.la
>  
> -noinst_PROGRAMS = \
> +check_PROGRAMS = \
>   radeon_ttm
>  
>  radeon_ttm_SOURCES = \
> diff --git a/tests/vbltest/Makefile.am b/tests/vbltest/Makefile.am
> index 77f9037..5886bd1 100644
> --- a/tests/vbltest/Makefile.am
> +++ b/tests/vbltest/Makefile.am
> @@ -2,7 +2,7 @@ AM_CFLAGS = \
>   -I$(top_srcdir)/include/drm \
>   -I$(top_srcdir)
>  
> -noinst_PROGRAMS = \
> +check_PROGRAMS = \
>   vbltest
>  
>  vbltest_SOURCES = \
> --
> 1.7.3.4
> 
> ___
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/dri-devel
> 
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH] Only build test programs with make check

2012-03-17 Thread Matt Turner
On Sat, Mar 17, 2012 at 2:35 PM, Jakob Bornecrantz  wrote:
> With my limited knowledge of automake I'm going to have
> to NACK this patch. Most of these programs are used during
> driver bring up to test things out, often we also modify
> them a bit to suit our need.
>
> If this patch doesn't make this any harder then
> disregard my NACK.

Harder, like having to run make check?

I'm not sure about modeprint, so I can drop that, but the rest look
like programs you'd want to run during make check.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: i915 modeset memory corruption issues? (Fwd: Oops in ext3_block_to_path.isra.40+0x26/0x11b)

2012-03-17 Thread Keith Packard
<#part sign=pgpmime>
On Sat, 17 Mar 2012 07:41:14 +, Dave Airlie  wrote:

> We've had reports on 965GM in Fedora, maybe davej has the specific
> bug.

Are these reports relatively recent, leading to a possible software bug
introduced in the last couple of releases?

-- 
keith.pack...@intel.com
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: i915 modeset memory corruption issues? (Fwd: Oops in ext3_block_to_path.isra.40+0x26/0x11b)

2012-03-17 Thread Dave Airlie
On Sat, Mar 17, 2012 at 6:59 PM, Keith Packard  wrote:
> <#part sign=pgpmime>
> On Sat, 17 Mar 2012 07:41:14 +, Dave Airlie  wrote:
>
>> We've had reports on 965GM in Fedora, maybe davej has the specific
>> bug.
>
> Are these reports relatively recent, leading to a possible software bug
> introduced in the last couple of releases?

No I think they are since 0-day KMS reports.

but its St Patricks day so really not a reliable info source for a
couple of days :-)

Dave.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH] RFC: dma-buf: userspace mmap support

2012-03-17 Thread Alan Cox
> > dma-buf file descriptor.  Userspace access to the buffer should be
> > bracketed with DMA_BUF_IOCTL_{PREPARE,FINISH}_ACCESS ioctl calls to
> > give the exporting driver a chance to deal with cache synchronization
> > and such for cached userspace mappings without resorting to page

There should be flags indicating if this is necessary. We don't want extra
syscalls on hardware that doesn't need it. The other question is what
info is needed as you may only want to poke a few pages out of cache and
the prepare/finish on its own gives no info.

> E.g. If another device was writing to the buffer, the prepare ioctl
> could block until that device had finished accessing that buffer.

How do you avoid deadlocks on this ? We need very clear ways to ensure
things always complete in some form given multiple buffer
owner/requestors and the fact this API has no "prepare-multiple-buffers"
support.

Alan
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: i915 modeset memory corruption issues? (Fwd: Oops in ext3_block_to_path.isra.40+0x26/0x11b)

2012-03-17 Thread Dave Airlie
On Sat, Mar 17, 2012 at 7:20 PM, Dave Airlie  wrote:
> On Sat, Mar 17, 2012 at 6:59 PM, Keith Packard  wrote:
>> <#part sign=pgpmime>
>> On Sat, 17 Mar 2012 07:41:14 +, Dave Airlie  wrote:
>>
>>> We've had reports on 965GM in Fedora, maybe davej has the specific
>>> bug.
>>
>> Are these reports relatively recent, leading to a possible software bug
>> introduced in the last couple of releases?
>
> No I think they are since 0-day KMS reports.
>
> but its St Patricks day so really not a reliable info source for a
> couple of days :-)

I did however get a flashback in google to this:

http://lists.fedoraproject.org/pipermail/scm-commits/2010-July/456636.html

Linus don't think we ever did work out why that worked, I wonder if we
lost something after that.
Dave.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: i915 modeset memory corruption issues? (Fwd: Oops in ext3_block_to_path.isra.40+0x26/0x11b)

2012-03-17 Thread Dave Airlie
Okay found the 965GM report

https://bugzilla.kernel.org/show_bug.cgi?id=37142

Dave.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: i915 modeset memory corruption issues? (Fwd: Oops in ext3_block_to_path.isra.40+0x26/0x11b)

2012-03-17 Thread Linus Torvalds
On Sat, Mar 17, 2012 at 1:23 PM, Dave Airlie  wrote:
>
> I did however get a flashback in google to this:
>
> http://lists.fedoraproject.org/pipermail/scm-commits/2010-July/456636.html
>
> Linus don't think we ever did work out why that worked, I wonder if we
> lost something after that.

Hmm. Maybe we should stop making the default gfp_mask be
GFP_HIGHMEM_MOVABLE (see inode_init_always() in fs/inode.c), and
instead add the _MOVABLE flag only for mappings that use
"generic_file_mmap()".

It does sound a bit scary to just make default mappings use MOVABLE
pages, when we know that can be incorrect for some cases.

But that would require that filesystems like ext4 etc that don't just
use the generic_file_mmap() function would have add do that MOVABLE
thing on their own. And the *normal* case is certainly that pages are
movable and putting them in themovable zone should be ok. It's just
*not* ok if you also map them into some hardware GTT thing or just
otherwise keep track of the pages directly, like DRI does.

The other possibility is to just make this a shm thing, since it's
generally that layer that has the case of "pages allocated with the
mapping gfp masks". Hugh Dickins clearly tried to make sure that the
DRM initialization of the gfp mask was honored, but maybe there is
some case where it is missed. Hugh added a mapping_set_gfp_mask() to
i915_gem_alloc_object(), but maybe we have i915 shmem mappings that
get set up through some other path.

What about the ttm swap storage thing, for example? That also does
shmem_file_setup(), without limiting the pages. But I don't think i915
uses ttm, right?

Al? Hugh? Opinions? Right now anybody who uses the
shmem_read_mapping_page() function is in danger of getting a MOVABLE
page by default. Which may or may not be right.

That said, I don't see how i915 would get a movable page. It *seems*
to do the right setup for the gfp flags of the mapping.

 Linus
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH] RFC: dma-buf: userspace mmap support

2012-03-17 Thread Rob Clark
On Sat, Mar 17, 2012 at 3:17 PM, Alan Cox  wrote:
>> > dma-buf file descriptor.  Userspace access to the buffer should be
>> > bracketed with DMA_BUF_IOCTL_{PREPARE,FINISH}_ACCESS ioctl calls to
>> > give the exporting driver a chance to deal with cache synchronization
>> > and such for cached userspace mappings without resorting to page
>
> There should be flags indicating if this is necessary. We don't want extra
> syscalls on hardware that doesn't need it. The other question is what
> info is needed as you may only want to poke a few pages out of cache and
> the prepare/finish on its own gives no info.

Well, there isn't really a convenient way to know, for some random
code that is just handed a dmabuf fd, what the flags are without
passing around an extra param in userspace.  So I'd tend to say, just
live with the syscall even if it is a no-op (because if you are doing
sw access to the buffer, that is anyways some slow/legacy path).  But
I'm open to suggestions.

As far as just peeking/poking a few pages, that is where some later
ioctls or additional params could come in, to give some hints.  But I
wanted to keep it simple to start.

>> E.g. If another device was writing to the buffer, the prepare ioctl
>> could block until that device had finished accessing that buffer.
>
> How do you avoid deadlocks on this ? We need very clear ways to ensure
> things always complete in some form given multiple buffer
> owner/requestors and the fact this API has no "prepare-multiple-buffers"
> support.

Probably some separate synchronization is needed.. I'm not really sure
if prepare/finish (or map/unmap on kernel side) is a the right way to
handle that.

BR,
-R

> Alan
> ___
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/dri-devel
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: i915 modeset memory corruption issues? (Fwd: Oops in ext3_block_to_path.isra.40+0x26/0x11b)

2012-03-17 Thread Linus Torvalds
On Sat, Mar 17, 2012 at 3:09 PM, Hugh Dickins  wrote:
>
> I've got to go out for an hour: I'll digest more and think more about
> this when I get back.  If someone could explain the original problem
> with _MOVABLE, that would help me:

I do not believe we actually ever uncovered the original problem with
_MOVABLE: the problem was bisected down to the stable-backported
version of commit 4bdadb978569 ("drm/i915: Selectively enable
self-reclaim"), and I looked at the changes and decided that one of
the main ones was the removal of the mapping_set_gfp_mask() - which
resulted in __GFP_MOVABLE being on for the mapping.

So we just tried re-introducing that, and it fixed the problem.

Exactly *why* movable pages are a problem was never all that clear.
The assumption was (and I guess still is) that there are physical
pointers to the pages in the DRM pages themselves. Quoting from the
original thread:

 "I could easily see that something would get very unhappy and corrupt
  memory if the suspend-to-disk phase ends up compacting memory and
  moving the pages around from under the i915 driver."

but I didn't actually see why the i915 page pinning would be defeated
by __GFP_MOVABLE. The code does get a reference to them afaik.

So no, we don't have a real first cause. We just have a band-aid for
the symptom. It might be a pure bug in hibernation that just assumes
that any pages in MOVABLE zones can always be moved without even
checking any refcounts, so not using __GFP_MOVABLE might just be
working around problems elsewhere.

Or it's entirely possible that the DRM layer does *not* increment the
refcounts of the pages properly, and that hibernate with movable pages
is just a very efficient way of showing the problems that results in.

> A factor which might turn out to be relevant: swapin_readahead() (or
> swapoff) brings in pages from swap before it knows who they belong to,
> so the swapped-in swapcache pages don't necessarily have the limitations
> mapping_set_gfp_mask() has asked for.  It's been discussed with Alan,
> we know it will need fixing for gma500 when eventually that comes to
> be used on machines with more than 4GB, but for now I wasn't aware of
> a problem.

So i915 shouldn't much care about highmem vs non-highmem, it's just
that it does hold on to pages and take their physical address. But as
far as I can tell, whenever the physical address is in use, it does
have a refcount to the page.

So for example, i915_gem_object_get_pages_gtt() will use
shmem_read_mapping_page_gfp() which will increment the page count for
the page it gets, so all the obj->pages[] entries should have properly
incremented page counts. And they get released by
i915_gem_object_put_pages_gtt(), but maybe that is called too early
while the pages are still in use by the GFX unit...

If there is a refcounting issue, hibernate + __GFP_MOVABLE might just
be a great way to see it.

I don't really know the code, though.

 Linus
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: i915 modeset memory corruption issues? (Fwd: Oops in ext3_block_to_path.isra.40+0x26/0x11b)

2012-03-17 Thread Keith Packard
<#part sign=pgpmime>
On Sat, 17 Mar 2012 15:52:15 -0700, Linus Torvalds 
 wrote:

> I do not believe we actually ever uncovered the original problem with
> _MOVABLE: the problem was bisected down to the stable-backported
> version of commit 4bdadb978569 ("drm/i915: Selectively enable
> self-reclaim"), and I looked at the changes and decided that one of
> the main ones was the removal of the mapping_set_gfp_mask() - which
> resulted in __GFP_MOVABLE being on for the mapping.

Can anyone explain what __GFP_MOVABLE even does? I can't understand what
this flag would be for; if the page is locked (with get_page), then the
page cannot move. If it isn't locked, then it's subject to swapping, in
which case the page will almost certainly move when it returns from
disk. Is it that the page won't move if it isn't swapped? That doesn't
seem all that useful to me.

> but I didn't actually see why the i915 page pinning would be defeated
> by __GFP_MOVABLE. The code does get a reference to them afaik.

GTT mapping and page locking are done in lock-step in the driver:

i915_gem_object_bind_to_gtt
i915_gem_object_get_pages_gtt
pins the pages
i915_gem_gtt_bind_object
maps to GTT

i915_gem_object_unbind
i915_gem_gtt_unbind_object
unmaps from GTT
i915_gem_object_put_pages_gtt
unpins the pages.

There are no other code paths to unmapping objects from the GTT or
unpinning the pages that I can find.

> So for example, i915_gem_object_get_pages_gtt() will use
> shmem_read_mapping_page_gfp() which will increment the page count for
> the page it gets, so all the obj->pages[] entries should have properly
> incremented page counts. And they get released by
> i915_gem_object_put_pages_gtt(), but maybe that is called too early
> while the pages are still in use by the GFX unit...

This seems the most likely problem -- there are so many caches and
buffers involved. However, we're seeing troubles on hibernate resume, at
which point there isn't any acceleration going on, just two fbdev
drivers poking the hardware. Which really reduces the complexity quite a
bit; it's just CPU reads/writes through the GTT aperture created for the
two console frame buffers. That makes this an interesting place to look
for trouble; we can ignore vast areas within the driver that deal with
acceleration, at least for this case.

-- 
keith.pack...@intel.com
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [V5 PATCH 1/4] drivers-gpu-drm-allow-to-load-edid-firmware.patch

2012-03-17 Thread Valdis . Kletnieks
On Thu, 15 Mar 2012 15:56:24 BST, Carsten Emde said:
> Broken monitors and/or broken graphic boards may send erroneous or no
> EDID data. This also applies to broken KVM devices that are unable to
> correctly forward the EDID data of the connected monitor but invent
> their own fantasy data.

>  Documentation/EDID/HOWTO.txt|   39 +
>  Documentation/EDID/Makefile |   26 +++

Documented *and* a Makefile to automate it. Nice. :)  That addresses my
comments..

Acked-by: Valdis Kletnieks 


pgpsTPtB0bBud.pgp
Description: PGP signature
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: i915 modeset memory corruption issues? (Fwd: Oops in ext3_block_to_path.isra.40+0x26/0x11b)

2012-03-17 Thread Keith Packard
<#part sign=pgpmime>
On Sat, 17 Mar 2012 18:44:18 -0700 (PDT), Hugh Dickins  wrote:

> __GFP_MOVABLE is a hint to page allocation that there's a good likelihood
> that this logical page can be migrated elsewhere in physical memory later
> on if mm wants, so it's a good idea to allocate it from a physical area of
> similarly MOVABLE pages;

So, allocating things with __GFP_MOVABLE may just change where in memory
the graphics pages get allocated, moving whatever GPU-inspired damage to
less sensitive bits of the kernel data. Sounds fabulous!

> I keep worrying about the sequence when the machine is powered on again
> after hibernation: can i915 get up to anything before it is resumed from
> the hibernation image?

Well, the frame buffer is presumably still using whatever mapping it had
before suspend occurred; is there any way it could be writing through
that before the graphics driver was resumed?

What I don't understand is the relationship between the boot kernel and
the resumed kernel; when does the boot kernel stop writing to the
console, and how does it hand off control of the frame buffer at that
time.

It would be great if we could separate out the boot kernel access to the
graphics system from the resumed system -- if the boot kernel was run
without the i915 driver loaded at all, and just used VGA text mode, then
any damage as a result of resume wouldn't be caused by the boot kernel
GTT mappings getting used at the wrong time.

-- 
keith.pack...@intel.com
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


[Linaro-mm-sig] [PATCH] dma-buf: add get_dma_buf()

2012-03-17 Thread Kyungmin Park
Reviewed-by: Kyungmin Park 

On 3/17/12, Rob Clark  wrote:
> From: Rob Clark 
>
> Works in a similar way to get_file(), and is needed in cases such as
> when the exporter needs to also keep a reference to the dmabuf (that
> is later released with a dma_buf_put()), and possibly other similar
> cases.
>
> Signed-off-by: Rob Clark 
> ---
> Minor update on original to add a missing #include
>
>  include/linux/dma-buf.h |   15 +++
>  1 files changed, 15 insertions(+), 0 deletions(-)
>
> diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
> index 891457a..bc4203dc 100644
> --- a/include/linux/dma-buf.h
> +++ b/include/linux/dma-buf.h
> @@ -30,6 +30,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>
>  struct dma_buf;
>  struct dma_buf_attachment;
> @@ -110,6 +111,20 @@ struct dma_buf_attachment {
>   void *priv;
>  };
>
> +/**
> + * get_dma_buf - convenience wrapper for get_file.
> + * @dmabuf:  [in]pointer to dma_buf
> + *
> + * Increments the reference count on the dma-buf, needed in case of drivers
> + * that either need to create additional references to the dmabuf on the
> + * kernel side.  For example, an exporter that needs to keep a dmabuf ptr
> + * so that subsequent exports don't create a new dmabuf.
> + */
> +static inline void get_dma_buf(struct dma_buf *dmabuf)
> +{
> + get_file(dmabuf->file);
> +}
> +
>  #ifdef CONFIG_DMA_SHARED_BUFFER
>  struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
>   struct device *dev);
> --
> 1.7.5.4
>
>
> ___
> Linaro-mm-sig mailing list
> Linaro-mm-sig at lists.linaro.org
> http://lists.linaro.org/mailman/listinfo/linaro-mm-sig
>


i915 modeset memory corruption issues? (Fwd: Oops in ext3_block_to_path.isra.40+0x26/0x11b)

2012-03-17 Thread Dave Airlie
On Fri, Mar 16, 2012 at 11:01 PM, Keith Packard  wrote:
> <#part sign=pgpmime>
> On Fri, 16 Mar 2012 16:47:46 +, Dave Airlie  wrote:
>
>> The hibernate issue is known and I've been hoping someone from Intel
>> would run with debugging it, they have a big enough team that I don't
>> feel I can expend the personal time to look into it.
>
> Yeah, I've been chatting with a couple of intel folks; we should have a
> test patch ready shortly, but we haven't been able to reproduce anything
> like this...
>
>> Maybe Keith can push someone or maybe I just refuse pull requests
>> until one with a fix appears.
>
> From what I've seen, this is a problem only on Ironlake machines, which
> makes me afraid of some weird GTT flushing issue, given the adventures
> we had with VT-d on that hardware where we idle the gpu before any GTT
> updates. I wonder what would happen if we idled the GPU before any GTT
> updates even when VT-d wasn't running...

We've had reports on 965GM in Fedora, maybe davej has the specific bug.

Dave.


[Bug 45366] Radeon gpu lockups

2012-03-17 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=45366

--- Comment #5 from Ernst Sj?strand  2012-03-17 03:39:47 
PDT ---
This might be related to reclocking the GPU because when I tried dynpm it
happened more often and it often happens when waking the computer up from DPMS.

-- 
Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are the assignee for the bug.


[Bug 29412] fans running at full-speed after resume from suspend with radeon and KMS

2012-03-17 Thread bugzilla-dae...@bugzilla.kernel.org
https://bugzilla.kernel.org/show_bug.cgi?id=29412





--- Comment #4 from Jon Dowland   
2012-03-17 11:27:26 ---
Ok retesting v2.6.32-rc1 first, I had trouble building it (gcc 4.6 won't work);
trouble getting the initramfs to work, and trouble getting X to start (some
kind of race condition / bad failure mode in the gdm3 init script launched ~300
Xorg instances before I killed it).

However, I could reproduce the issue without X: single-user mode, no radeon
module loaded: fans climbed down as they should. Single-user mode, radeon
driver loaded: fans didn't climb down.

I'll check v2.6.32-rc1^ next, then I'll collect dmesg/Xorg.0.logs as
appropriate.

(I confirmed this with 3.3~rc6-1~experimental.1 in the mean time)

-- 
Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are watching the assignee of the bug.


[Bug 29412] fans running at full-speed after resume from suspend with radeon and KMS

2012-03-17 Thread bugzilla-dae...@bugzilla.kernel.org
https://bugzilla.kernel.org/show_bug.cgi?id=29412





--- Comment #5 from Jon Dowland   
2012-03-17 11:49:13 ---
Created an attachment (id=72633)
 --> (https://bugzilla.kernel.org/attachment.cgi?id=72633)
dmesg from v2.6.32-rc1^

Frustratingly, the issue reared its head with v2.6.32-rc1^.  This dmesg is from
a single user mode session, two suspends with "modprobe radeon" in the middle.

It's getting hard to run kernels this old (even though they aren't that old!)
notice the error loading the radeon firmware: no idea why it hasn't managed,
other kernels do (and it's there).  Also udev refuses to start with < 2.6.32.

I may have to set up a test root with an older userspace (squeeze perhaps) to
go further :(

Does this basically invalidate my previous bisect?

-- 
Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are watching the assignee of the bug.


[PATCH] Only build test programs with make check

2012-03-17 Thread Matt Turner
Signed-off-by: Matt Turner 
---
 tests/kmstest/Makefile.am   |2 +-
 tests/modeprint/Makefile.am |2 +-
 tests/modetest/Makefile.am  |2 +-
 tests/radeon/Makefile.am|2 +-
 tests/vbltest/Makefile.am   |2 +-
 5 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/tests/kmstest/Makefile.am b/tests/kmstest/Makefile.am
index ae562a1..10b9ef3 100644
--- a/tests/kmstest/Makefile.am
+++ b/tests/kmstest/Makefile.am
@@ -3,7 +3,7 @@ AM_CFLAGS = \
-I$(top_srcdir)/libkms/ \
-I$(top_srcdir)

-noinst_PROGRAMS = \
+check_PROGRAMS = \
kmstest

 kmstest_SOURCES = \
diff --git a/tests/modeprint/Makefile.am b/tests/modeprint/Makefile.am
index c4862ac..4291dc5 100644
--- a/tests/modeprint/Makefile.am
+++ b/tests/modeprint/Makefile.am
@@ -2,7 +2,7 @@ AM_CFLAGS = \
-I$(top_srcdir)/include/drm \
-I$(top_srcdir)

-noinst_PROGRAMS = \
+check_PROGRAMS = \
modeprint

 modeprint_SOURCES = \
diff --git a/tests/modetest/Makefile.am b/tests/modetest/Makefile.am
index 2191242..dd43d0c 100644
--- a/tests/modetest/Makefile.am
+++ b/tests/modetest/Makefile.am
@@ -4,7 +4,7 @@ AM_CFLAGS = \
-I$(top_srcdir) \
$(CAIRO_CFLAGS)

-noinst_PROGRAMS = \
+check_PROGRAMS = \
modetest

 modetest_SOURCES = \
diff --git a/tests/radeon/Makefile.am b/tests/radeon/Makefile.am
index 1775669..d4f6755 100644
--- a/tests/radeon/Makefile.am
+++ b/tests/radeon/Makefile.am
@@ -4,7 +4,7 @@ AM_CFLAGS = \

 LDADD = $(top_builddir)/libdrm.la

-noinst_PROGRAMS = \
+check_PROGRAMS = \
radeon_ttm

 radeon_ttm_SOURCES = \
diff --git a/tests/vbltest/Makefile.am b/tests/vbltest/Makefile.am
index 77f9037..5886bd1 100644
--- a/tests/vbltest/Makefile.am
+++ b/tests/vbltest/Makefile.am
@@ -2,7 +2,7 @@ AM_CFLAGS = \
-I$(top_srcdir)/include/drm \
-I$(top_srcdir)

-noinst_PROGRAMS = \
+check_PROGRAMS = \
vbltest

 vbltest_SOURCES = \
-- 
1.7.3.4



[PATCH] Only build test programs with make check

2012-03-17 Thread Jakob Bornecrantz
With my limited knowledge of automake I'm going to have
to NACK this patch. Most of these programs are used during
driver bring up to test things out, often we also modify
them a bit to suit our need.

If this patch doesn't make this any harder then
disregard my NACK.

Cheers Jakob.

- Original Message -
> Signed-off-by: Matt Turner 
> ---
>  tests/kmstest/Makefile.am   |2 +-
>  tests/modeprint/Makefile.am |2 +-
>  tests/modetest/Makefile.am  |2 +-
>  tests/radeon/Makefile.am|2 +-
>  tests/vbltest/Makefile.am   |2 +-
>  5 files changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/tests/kmstest/Makefile.am b/tests/kmstest/Makefile.am
> index ae562a1..10b9ef3 100644
> --- a/tests/kmstest/Makefile.am
> +++ b/tests/kmstest/Makefile.am
> @@ -3,7 +3,7 @@ AM_CFLAGS = \
>   -I$(top_srcdir)/libkms/ \
>   -I$(top_srcdir)
>  
> -noinst_PROGRAMS = \
> +check_PROGRAMS = \
>   kmstest
>  
>  kmstest_SOURCES = \
> diff --git a/tests/modeprint/Makefile.am
> b/tests/modeprint/Makefile.am
> index c4862ac..4291dc5 100644
> --- a/tests/modeprint/Makefile.am
> +++ b/tests/modeprint/Makefile.am
> @@ -2,7 +2,7 @@ AM_CFLAGS = \
>   -I$(top_srcdir)/include/drm \
>   -I$(top_srcdir)
>  
> -noinst_PROGRAMS = \
> +check_PROGRAMS = \
>   modeprint
>  
>  modeprint_SOURCES = \
> diff --git a/tests/modetest/Makefile.am b/tests/modetest/Makefile.am
> index 2191242..dd43d0c 100644
> --- a/tests/modetest/Makefile.am
> +++ b/tests/modetest/Makefile.am
> @@ -4,7 +4,7 @@ AM_CFLAGS = \
>   -I$(top_srcdir) \
>   $(CAIRO_CFLAGS)
>  
> -noinst_PROGRAMS = \
> +check_PROGRAMS = \
>   modetest
>  
>  modetest_SOURCES = \
> diff --git a/tests/radeon/Makefile.am b/tests/radeon/Makefile.am
> index 1775669..d4f6755 100644
> --- a/tests/radeon/Makefile.am
> +++ b/tests/radeon/Makefile.am
> @@ -4,7 +4,7 @@ AM_CFLAGS = \
>  
>  LDADD = $(top_builddir)/libdrm.la
>  
> -noinst_PROGRAMS = \
> +check_PROGRAMS = \
>   radeon_ttm
>  
>  radeon_ttm_SOURCES = \
> diff --git a/tests/vbltest/Makefile.am b/tests/vbltest/Makefile.am
> index 77f9037..5886bd1 100644
> --- a/tests/vbltest/Makefile.am
> +++ b/tests/vbltest/Makefile.am
> @@ -2,7 +2,7 @@ AM_CFLAGS = \
>   -I$(top_srcdir)/include/drm \
>   -I$(top_srcdir)
>  
> -noinst_PROGRAMS = \
> +check_PROGRAMS = \
>   vbltest
>  
>  vbltest_SOURCES = \
> --
> 1.7.3.4
> 
> ___
> dri-devel mailing list
> dri-devel at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/dri-devel
> 


[PATCH] Only build test programs with make check

2012-03-17 Thread Matt Turner
On Sat, Mar 17, 2012 at 2:35 PM, Jakob Bornecrantz  wrote:
> With my limited knowledge of automake I'm going to have
> to NACK this patch. Most of these programs are used during
> driver bring up to test things out, often we also modify
> them a bit to suit our need.
>
> If this patch doesn't make this any harder then
> disregard my NACK.

Harder, like having to run make check?

I'm not sure about modeprint, so I can drop that, but the rest look
like programs you'd want to run during make check.


i915 modeset memory corruption issues? (Fwd: Oops in ext3_block_to_path.isra.40+0x26/0x11b)

2012-03-17 Thread Keith Packard
<#part sign=pgpmime>
On Sat, 17 Mar 2012 07:41:14 +, Dave Airlie  wrote:

> We've had reports on 965GM in Fedora, maybe davej has the specific
> bug.

Are these reports relatively recent, leading to a possible software bug
introduced in the last couple of releases?

-- 
keith.packard at intel.com


i915 modeset memory corruption issues? (Fwd: Oops in ext3_block_to_path.isra.40+0x26/0x11b)

2012-03-17 Thread Dave Airlie
On Sat, Mar 17, 2012 at 6:59 PM, Keith Packard  wrote:
> <#part sign=pgpmime>
> On Sat, 17 Mar 2012 07:41:14 +, Dave Airlie  wrote:
>
>> We've had reports on 965GM in Fedora, maybe davej has the specific
>> bug.
>
> Are these reports relatively recent, leading to a possible software bug
> introduced in the last couple of releases?

No I think they are since 0-day KMS reports.

but its St Patricks day so really not a reliable info source for a
couple of days :-)

Dave.


[PATCH] RFC: dma-buf: userspace mmap support

2012-03-17 Thread Alan Cox
> > dma-buf file descriptor.  Userspace access to the buffer should be
> > bracketed with DMA_BUF_IOCTL_{PREPARE,FINISH}_ACCESS ioctl calls to
> > give the exporting driver a chance to deal with cache synchronization
> > and such for cached userspace mappings without resorting to page

There should be flags indicating if this is necessary. We don't want extra
syscalls on hardware that doesn't need it. The other question is what
info is needed as you may only want to poke a few pages out of cache and
the prepare/finish on its own gives no info.

> E.g. If another device was writing to the buffer, the prepare ioctl
> could block until that device had finished accessing that buffer.

How do you avoid deadlocks on this ? We need very clear ways to ensure
things always complete in some form given multiple buffer
owner/requestors and the fact this API has no "prepare-multiple-buffers"
support.

Alan


i915 modeset memory corruption issues? (Fwd: Oops in ext3_block_to_path.isra.40+0x26/0x11b)

2012-03-17 Thread Dave Airlie
On Sat, Mar 17, 2012 at 7:20 PM, Dave Airlie  wrote:
> On Sat, Mar 17, 2012 at 6:59 PM, Keith Packard  wrote:
>> <#part sign=pgpmime>
>> On Sat, 17 Mar 2012 07:41:14 +, Dave Airlie  wrote:
>>
>>> We've had reports on 965GM in Fedora, maybe davej has the specific
>>> bug.
>>
>> Are these reports relatively recent, leading to a possible software bug
>> introduced in the last couple of releases?
>
> No I think they are since 0-day KMS reports.
>
> but its St Patricks day so really not a reliable info source for a
> couple of days :-)

I did however get a flashback in google to this:

http://lists.fedoraproject.org/pipermail/scm-commits/2010-July/456636.html

Linus don't think we ever did work out why that worked, I wonder if we
lost something after that.
Dave.


i915 modeset memory corruption issues? (Fwd: Oops in ext3_block_to_path.isra.40+0x26/0x11b)

2012-03-17 Thread Dave Airlie
Okay found the 965GM report

https://bugzilla.kernel.org/show_bug.cgi?id=37142

Dave.


i915 modeset memory corruption issues? (Fwd: Oops in ext3_block_to_path.isra.40+0x26/0x11b)

2012-03-17 Thread Linus Torvalds
On Sat, Mar 17, 2012 at 1:23 PM, Dave Airlie  wrote:
>
> I did however get a flashback in google to this:
>
> http://lists.fedoraproject.org/pipermail/scm-commits/2010-July/456636.html
>
> Linus don't think we ever did work out why that worked, I wonder if we
> lost something after that.

Hmm. Maybe we should stop making the default gfp_mask be
GFP_HIGHMEM_MOVABLE (see inode_init_always() in fs/inode.c), and
instead add the _MOVABLE flag only for mappings that use
"generic_file_mmap()".

It does sound a bit scary to just make default mappings use MOVABLE
pages, when we know that can be incorrect for some cases.

But that would require that filesystems like ext4 etc that don't just
use the generic_file_mmap() function would have add do that MOVABLE
thing on their own. And the *normal* case is certainly that pages are
movable and putting them in themovable zone should be ok. It's just
*not* ok if you also map them into some hardware GTT thing or just
otherwise keep track of the pages directly, like DRI does.

The other possibility is to just make this a shm thing, since it's
generally that layer that has the case of "pages allocated with the
mapping gfp masks". Hugh Dickins clearly tried to make sure that the
DRM initialization of the gfp mask was honored, but maybe there is
some case where it is missed. Hugh added a mapping_set_gfp_mask() to
i915_gem_alloc_object(), but maybe we have i915 shmem mappings that
get set up through some other path.

What about the ttm swap storage thing, for example? That also does
shmem_file_setup(), without limiting the pages. But I don't think i915
uses ttm, right?

Al? Hugh? Opinions? Right now anybody who uses the
shmem_read_mapping_page() function is in danger of getting a MOVABLE
page by default. Which may or may not be right.

That said, I don't see how i915 would get a movable page. It *seems*
to do the right setup for the gfp flags of the mapping.

 Linus


[PATCH] RFC: dma-buf: userspace mmap support

2012-03-17 Thread Rob Clark
On Sat, Mar 17, 2012 at 3:17 PM, Alan Cox  wrote:
>> > dma-buf file descriptor. ?Userspace access to the buffer should be
>> > bracketed with DMA_BUF_IOCTL_{PREPARE,FINISH}_ACCESS ioctl calls to
>> > give the exporting driver a chance to deal with cache synchronization
>> > and such for cached userspace mappings without resorting to page
>
> There should be flags indicating if this is necessary. We don't want extra
> syscalls on hardware that doesn't need it. The other question is what
> info is needed as you may only want to poke a few pages out of cache and
> the prepare/finish on its own gives no info.

Well, there isn't really a convenient way to know, for some random
code that is just handed a dmabuf fd, what the flags are without
passing around an extra param in userspace.  So I'd tend to say, just
live with the syscall even if it is a no-op (because if you are doing
sw access to the buffer, that is anyways some slow/legacy path).  But
I'm open to suggestions.

As far as just peeking/poking a few pages, that is where some later
ioctls or additional params could come in, to give some hints.  But I
wanted to keep it simple to start.

>> E.g. If another device was writing to the buffer, the prepare ioctl
>> could block until that device had finished accessing that buffer.
>
> How do you avoid deadlocks on this ? We need very clear ways to ensure
> things always complete in some form given multiple buffer
> owner/requestors and the fact this API has no "prepare-multiple-buffers"
> support.

Probably some separate synchronization is needed.. I'm not really sure
if prepare/finish (or map/unmap on kernel side) is a the right way to
handle that.

BR,
-R

> Alan
> ___
> dri-devel mailing list
> dri-devel at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/dri-devel


i915 modeset memory corruption issues? (Fwd: Oops in ext3_block_to_path.isra.40+0x26/0x11b)

2012-03-17 Thread Linus Torvalds
On Sat, Mar 17, 2012 at 3:09 PM, Hugh Dickins  wrote:
>
> I've got to go out for an hour: I'll digest more and think more about
> this when I get back. ?If someone could explain the original problem
> with _MOVABLE, that would help me:

I do not believe we actually ever uncovered the original problem with
_MOVABLE: the problem was bisected down to the stable-backported
version of commit 4bdadb978569 ("drm/i915: Selectively enable
self-reclaim"), and I looked at the changes and decided that one of
the main ones was the removal of the mapping_set_gfp_mask() - which
resulted in __GFP_MOVABLE being on for the mapping.

So we just tried re-introducing that, and it fixed the problem.

Exactly *why* movable pages are a problem was never all that clear.
The assumption was (and I guess still is) that there are physical
pointers to the pages in the DRM pages themselves. Quoting from the
original thread:

 "I could easily see that something would get very unhappy and corrupt
  memory if the suspend-to-disk phase ends up compacting memory and
  moving the pages around from under the i915 driver."

but I didn't actually see why the i915 page pinning would be defeated
by __GFP_MOVABLE. The code does get a reference to them afaik.

So no, we don't have a real first cause. We just have a band-aid for
the symptom. It might be a pure bug in hibernation that just assumes
that any pages in MOVABLE zones can always be moved without even
checking any refcounts, so not using __GFP_MOVABLE might just be
working around problems elsewhere.

Or it's entirely possible that the DRM layer does *not* increment the
refcounts of the pages properly, and that hibernate with movable pages
is just a very efficient way of showing the problems that results in.

> A factor which might turn out to be relevant: swapin_readahead() (or
> swapoff) brings in pages from swap before it knows who they belong to,
> so the swapped-in swapcache pages don't necessarily have the limitations
> mapping_set_gfp_mask() has asked for. ?It's been discussed with Alan,
> we know it will need fixing for gma500 when eventually that comes to
> be used on machines with more than 4GB, but for now I wasn't aware of
> a problem.

So i915 shouldn't much care about highmem vs non-highmem, it's just
that it does hold on to pages and take their physical address. But as
far as I can tell, whenever the physical address is in use, it does
have a refcount to the page.

So for example, i915_gem_object_get_pages_gtt() will use
shmem_read_mapping_page_gfp() which will increment the page count for
the page it gets, so all the obj->pages[] entries should have properly
incremented page counts. And they get released by
i915_gem_object_put_pages_gtt(), but maybe that is called too early
while the pages are still in use by the GFX unit...

If there is a refcounting issue, hibernate + __GFP_MOVABLE might just
be a great way to see it.

I don't really know the code, though.

 Linus


i915 modeset memory corruption issues? (Fwd: Oops in ext3_block_to_path.isra.40+0x26/0x11b)

2012-03-17 Thread Keith Packard
<#part sign=pgpmime>
On Sat, 17 Mar 2012 15:52:15 -0700, Linus Torvalds  wrote:

> I do not believe we actually ever uncovered the original problem with
> _MOVABLE: the problem was bisected down to the stable-backported
> version of commit 4bdadb978569 ("drm/i915: Selectively enable
> self-reclaim"), and I looked at the changes and decided that one of
> the main ones was the removal of the mapping_set_gfp_mask() - which
> resulted in __GFP_MOVABLE being on for the mapping.

Can anyone explain what __GFP_MOVABLE even does? I can't understand what
this flag would be for; if the page is locked (with get_page), then the
page cannot move. If it isn't locked, then it's subject to swapping, in
which case the page will almost certainly move when it returns from
disk. Is it that the page won't move if it isn't swapped? That doesn't
seem all that useful to me.

> but I didn't actually see why the i915 page pinning would be defeated
> by __GFP_MOVABLE. The code does get a reference to them afaik.

GTT mapping and page locking are done in lock-step in the driver:

i915_gem_object_bind_to_gtt
i915_gem_object_get_pages_gtt
pins the pages
i915_gem_gtt_bind_object
maps to GTT

i915_gem_object_unbind
i915_gem_gtt_unbind_object
unmaps from GTT
i915_gem_object_put_pages_gtt
unpins the pages.

There are no other code paths to unmapping objects from the GTT or
unpinning the pages that I can find.

> So for example, i915_gem_object_get_pages_gtt() will use
> shmem_read_mapping_page_gfp() which will increment the page count for
> the page it gets, so all the obj->pages[] entries should have properly
> incremented page counts. And they get released by
> i915_gem_object_put_pages_gtt(), but maybe that is called too early
> while the pages are still in use by the GFX unit...

This seems the most likely problem -- there are so many caches and
buffers involved. However, we're seeing troubles on hibernate resume, at
which point there isn't any acceleration going on, just two fbdev
drivers poking the hardware. Which really reduces the complexity quite a
bit; it's just CPU reads/writes through the GTT aperture created for the
two console frame buffers. That makes this an interesting place to look
for trouble; we can ignore vast areas within the driver that deal with
acceleration, at least for this case.

-- 
keith.packard at intel.com


[V5 PATCH 1/4] drivers-gpu-drm-allow-to-load-edid-firmware.patch

2012-03-17 Thread valdis.kletni...@vt.edu
On Thu, 15 Mar 2012 15:56:24 BST, Carsten Emde said:
> Broken monitors and/or broken graphic boards may send erroneous or no
> EDID data. This also applies to broken KVM devices that are unable to
> correctly forward the EDID data of the connected monitor but invent
> their own fantasy data.

>  Documentation/EDID/HOWTO.txt|   39 +
>  Documentation/EDID/Makefile |   26 +++

Documented *and* a Makefile to automate it. Nice. :)  That addresses my
comments..

Acked-by: Valdis Kletnieks 
-- next part --
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 865 bytes
Desc: not available
URL: 
<http://lists.freedesktop.org/archives/dri-devel/attachments/20120317/dbda4b5b/attachment.pgp>


i915 modeset memory corruption issues? (Fwd: Oops in ext3_block_to_path.isra.40+0x26/0x11b)

2012-03-17 Thread Keith Packard
<#part sign=pgpmime>
On Sat, 17 Mar 2012 18:44:18 -0700 (PDT), Hugh Dickins  
wrote:

> __GFP_MOVABLE is a hint to page allocation that there's a good likelihood
> that this logical page can be migrated elsewhere in physical memory later
> on if mm wants, so it's a good idea to allocate it from a physical area of
> similarly MOVABLE pages;

So, allocating things with __GFP_MOVABLE may just change where in memory
the graphics pages get allocated, moving whatever GPU-inspired damage to
less sensitive bits of the kernel data. Sounds fabulous!

> I keep worrying about the sequence when the machine is powered on again
> after hibernation: can i915 get up to anything before it is resumed from
> the hibernation image?

Well, the frame buffer is presumably still using whatever mapping it had
before suspend occurred; is there any way it could be writing through
that before the graphics driver was resumed?

What I don't understand is the relationship between the boot kernel and
the resumed kernel; when does the boot kernel stop writing to the
console, and how does it hand off control of the frame buffer at that
time.

It would be great if we could separate out the boot kernel access to the
graphics system from the resumed system -- if the boot kernel was run
without the i915 driver loaded at all, and just used VGA text mode, then
any damage as a result of resume wouldn't be caused by the boot kernel
GTT mappings getting used at the wrong time.

-- 
keith.packard at intel.com


[PATCH] drivers/gpu/drm/radeon/radeon_cs.c: eliminate possible double free

2012-03-17 Thread Julia Lawall
From: Julia Lawall 

The function radeon_cs_parser_init is only called from two places, in
drivers/gpu/drm/radeon/radeon_cs.c and drivers/gpu/drm/radeon/r600_cs.c.
In each case, if the call fails another function is called that frees all
of the kdata and dpage information in the chunks array.  So this
information should not be freed in radeon_cs_parser_init as well.

Signed-off-by: Julia Lawall 

---
 drivers/gpu/drm/radeon/radeon_cs.c |   16 ++--
 1 file changed, 2 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_cs.c 
b/drivers/gpu/drm/radeon/radeon_cs.c
index 9b4124e..d9d9f5a 100644
--- a/drivers/gpu/drm/radeon/radeon_cs.c
+++ b/drivers/gpu/drm/radeon/radeon_cs.c
@@ -243,20 +243,11 @@ int radeon_cs_parser_init(struct radeon_cs_parser *p, 
void *data)
if ((p->cs_flags & RADEON_CS_USE_VM) &&
!p->rdev->vm_manager.enabled) {
DRM_ERROR("VM not active on asic!\n");
-   if (p->chunk_relocs_idx != -1)
-   kfree(p->chunks[p->chunk_relocs_idx].kdata);
-   if (p->chunk_flags_idx != -1)
-   kfree(p->chunks[p->chunk_flags_idx].kdata);
return -EINVAL;
}

-   if (radeon_cs_get_ring(p, ring, priority)) {
-   if (p->chunk_relocs_idx != -1)
-   kfree(p->chunks[p->chunk_relocs_idx].kdata);
-   if (p->chunk_flags_idx != -1)
-   kfree(p->chunks[p->chunk_flags_idx].kdata);
+   if (radeon_cs_get_ring(p, ring, priority))
return -EINVAL;
-   }


/* deal with non-vm */
@@ -271,11 +262,8 @@ int radeon_cs_parser_init(struct radeon_cs_parser *p, void 
*data)
p->chunks[p->chunk_ib_idx].kpage[0] = kmalloc(PAGE_SIZE, 
GFP_KERNEL);
p->chunks[p->chunk_ib_idx].kpage[1] = kmalloc(PAGE_SIZE, 
GFP_KERNEL);
if (p->chunks[p->chunk_ib_idx].kpage[0] == NULL ||
-   p->chunks[p->chunk_ib_idx].kpage[1] == NULL) {
-   kfree(p->chunks[p->chunk_ib_idx].kpage[0]);
-   kfree(p->chunks[p->chunk_ib_idx].kpage[1]);
+   p->chunks[p->chunk_ib_idx].kpage[1] == NULL)
return -ENOMEM;
-   }
p->chunks[p->chunk_ib_idx].kpage_idx[0] = -1;
p->chunks[p->chunk_ib_idx].kpage_idx[1] = -1;
p->chunks[p->chunk_ib_idx].last_copied_page = -1;



[PATCH] drivers/gpu/drm/savage/savage_state.c: add missing kfree

2012-03-17 Thread Julia Lawall
From: Julia Lawall 

Most of the error handling code in this function frees the buffers
kcmd_addr, kvb_addr, and kbox_addr allocated at the beginning of this
function.  These two branches are changed to do the same.

Signed-off-by: Julia Lawall 

---
 drivers/gpu/drm/savage/savage_state.c |5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/savage/savage_state.c 
b/drivers/gpu/drm/savage/savage_state.c
index 8a3e315..031aaaf 100644
--- a/drivers/gpu/drm/savage/savage_state.c
+++ b/drivers/gpu/drm/savage/savage_state.c
@@ -1057,7 +1057,8 @@ int savage_bci_cmdbuf(struct drm_device *dev, void *data, 
struct drm_file *file_
DRM_ERROR("indexed drawing command extends "
  "beyond end of command buffer\n");
DMA_FLUSH();
-   return -EINVAL;
+   ret = -EINVAL;
+   goto done;
}
/* fall through */
case SAVAGE_CMD_DMA_PRIM:
@@ -1076,7 +1077,7 @@ int savage_bci_cmdbuf(struct drm_device *dev, void *data, 
struct drm_file *file_
  cmdbuf->vb_stride,
  cmdbuf->nbox, cmdbuf->box_addr);
if (ret != 0)
-   return ret;
+   goto done;
first_draw_cmd = NULL;
}
}



i915 modeset memory corruption issues? (Fwd: Oops in ext3_block_to_path.isra.40+0x26/0x11b)

2012-03-17 Thread Hugh Dickins
On Sat, 17 Mar 2012, Linus Torvalds wrote:
> On Sat, Mar 17, 2012 at 1:23 PM, Dave Airlie  wrote:
> >
> > I did however get a flashback in google to this:
> >
> > http://lists.fedoraproject.org/pipermail/scm-commits/2010-July/456636.html
> >
> > Linus don't think we ever did work out why that worked, I wonder if we
> > lost something after that.
> 
> Hmm. Maybe we should stop making the default gfp_mask be
> GFP_HIGHMEM_MOVABLE (see inode_init_always() in fs/inode.c), and
> instead add the _MOVABLE flag only for mappings that use
> "generic_file_mmap()".
> 
> It does sound a bit scary to just make default mappings use MOVABLE
> pages, when we know that can be incorrect for some cases.
> 
> But that would require that filesystems like ext4 etc that don't just
> use the generic_file_mmap() function would have add do that MOVABLE
> thing on their own. And the *normal* case is certainly that pages are
> movable and putting them in themovable zone should be ok. It's just
> *not* ok if you also map them into some hardware GTT thing or just
> otherwise keep track of the pages directly, like DRI does.
> 
> The other possibility is to just make this a shm thing, since it's
> generally that layer that has the case of "pages allocated with the
> mapping gfp masks". Hugh Dickins clearly tried to make sure that the
> DRM initialization of the gfp mask was honored, but maybe there is
> some case where it is missed. Hugh added a mapping_set_gfp_mask() to
> i915_gem_alloc_object(), but maybe we have i915 shmem mappings that
> get set up through some other path.
> 
> What about the ttm swap storage thing, for example? That also does
> shmem_file_setup(), without limiting the pages. But I don't think i915
> uses ttm, right?
> 
> Al? Hugh? Opinions? Right now anybody who uses the
> shmem_read_mapping_page() function is in danger of getting a MOVABLE
> page by default. Which may or may not be right.
> 
> That said, I don't see how i915 would get a movable page. It *seems*
> to do the right setup for the gfp flags of the mapping.

Yes, i915_gem_alloc_object() does its own mapping_set_gfp_mask(mapping,
GFP_HIGHUSER | __GFP_RECLAIMABLE), which should be as good as ever.

I've got to go out for an hour: I'll digest more and think more about
this when I get back.  If someone could explain the original problem
with _MOVABLE, that would help me: I didn't see an explanation in the
patch or in the bugzilla.  I would expect using _MOVABLE for i915_gem
would render a block of otherwise movable pages immovable because of
the GEM pages pinned, but I wouldn't expect them to move in mysterious
ways - or are references held to the pages without them being pinned??.

A factor which might turn out to be relevant: swapin_readahead() (or
swapoff) brings in pages from swap before it knows who they belong to,
so the swapped-in swapcache pages don't necessarily have the limitations
mapping_set_gfp_mask() has asked for.  It's been discussed with Alan,
we know it will need fixing for gma500 when eventually that comes to
be used on machines with more than 4GB, but for now I wasn't aware of
a problem.

Hugh


i915 modeset memory corruption issues? (Fwd: Oops in ext3_block_to_path.isra.40+0x26/0x11b)

2012-03-17 Thread Hugh Dickins
On Sat, 17 Mar 2012, Keith Packard wrote:
> On Sat, 17 Mar 2012 15:52:15 -0700, Linus Torvalds  linux-foundation.org> wrote:
> 
> > I do not believe we actually ever uncovered the original problem with
> > _MOVABLE: the problem was bisected down to the stable-backported
> > version of commit 4bdadb978569 ("drm/i915: Selectively enable
> > self-reclaim"), and I looked at the changes and decided that one of
> > the main ones was the removal of the mapping_set_gfp_mask() - which
> > resulted in __GFP_MOVABLE being on for the mapping.
> 
> Can anyone explain what __GFP_MOVABLE even does? I can't understand what
> this flag would be for; if the page is locked (with get_page), then the
> page cannot move. If it isn't locked, then it's subject to swapping, in
> which case the page will almost certainly move when it returns from
> disk. Is it that the page won't move if it isn't swapped? That doesn't
> seem all that useful to me.

__GFP_MOVABLE is a hint to page allocation that there's a good likelihood
that this logical page can be migrated elsewhere in physical memory later
on if mm wants, so it's a good idea to allocate it from a physical area of
similarly MOVABLE pages; then if later on someone wants a large contiguous
area for something (or wants to hot-unplug that memory), it should be easy
to clear the whole area out, moving existing pages elsewhere.  (I think
that's right: several questions come to me as I write it, but now is not
the time to research all those details.)  Page migration can only be done
later if it can account for all of page_count(page).

> 
> > but I didn't actually see why the i915 page pinning would be defeated
> > by __GFP_MOVABLE. The code does get a reference to them afaik.
> 
> GTT mapping and page locking are done in lock-step in the driver:
> 
> i915_gem_object_bind_to_gtt
> i915_gem_object_get_pages_gtt
> pins the pages
> i915_gem_gtt_bind_object
> maps to GTT
> 
> i915_gem_object_unbind
>   i915_gem_gtt_unbind_object
> unmaps from GTT
>   i915_gem_object_put_pages_gtt
> unpins the pages.
> 
> There are no other code paths to unmapping objects from the GTT or
> unpinning the pages that I can find.
> 
> > So for example, i915_gem_object_get_pages_gtt() will use
> > shmem_read_mapping_page_gfp() which will increment the page count for
> > the page it gets, so all the obj->pages[] entries should have properly
> > incremented page counts. And they get released by
> > i915_gem_object_put_pages_gtt(), but maybe that is called too early
> > while the pages are still in use by the GFX unit...
> 
> This seems the most likely problem -- there are so many caches and
> buffers involved. However, we're seeing troubles on hibernate resume, at
> which point there isn't any acceleration going on, just two fbdev
> drivers poking the hardware. Which really reduces the complexity quite a
> bit; it's just CPU reads/writes through the GTT aperture created for the
> two console frame buffers. That makes this an interesting place to look
> for trouble; we can ignore vast areas within the driver that deal with
> acceleration, at least for this case.

I keep worrying about the sequence when the machine is powered on again
after hibernation: can i915 get up to anything before it is resumed from
the hibernation image?  Get to use certain pages at that stage, then
continue to poke at them after the hibernation image is restored (which
changes the story of what pages are free and what are used for what):
lacking some kind of flush?

Hugh


i915 modeset memory corruption issues? (Fwd: Oops in ext3_block_to_path.isra.40+0x26/0x11b)

2012-03-17 Thread Hugh Dickins
Added Rafael to the Cc: Rafael, we're pondering over one or more of these
recurrent threads about corruption after resume, seemingly related to i915.

On Sat, 17 Mar 2012, Keith Packard wrote:
> On Sat, 17 Mar 2012 18:44:18 -0700 (PDT), Hugh Dickins  
> wrote:
> > I keep worrying about the sequence when the machine is powered on again
> > after hibernation: can i915 get up to anything before it is resumed from
> > the hibernation image?
> 
> Well, the frame buffer is presumably still using whatever mapping it had
> before suspend occurred; is there any way it could be writing through
> that before the graphics driver was resumed?

It's hibernation restore here, so I don't think it could be using the
mapping from before hibernation until after resuming from hibernation
snapshot: it would be using the rebooting kernel's mapping until then.

> 
> What I don't understand is the relationship between the boot kernel and
> the resumed kernel; when does the boot kernel stop writing to the
> console, and how does it hand off control of the frame buffer at that
> time.

I believe the handoff point comes in the late initcall software_resume():
which loads the image and calls hibernation_restore -> resume_target_kernel
-> swsusp_arch_resume, which emerges into the restored hibernation image.

As a late initcall, I imagine some work has already been done via the
framebuffer, but I have no conception of what kind of mappings that
involves (would shmem objects come into it at all? and is that even
a relevant question, could enough damage be done without them?), nor
whether they're properly torn down before emerging into the hibernimage.

> 
> It would be great if we could separate out the boot kernel access to the
> graphics system from the resumed system -- if the boot kernel was run
> without the i915 driver loaded at all, and just used VGA text mode, then
> any damage as a result of resume wouldn't be caused by the boot kernel
> GTT mappings getting used at the wrong time.

But you're giving my worry more credence than it deserves there:
we don't have any evidence that this is where the problem lies,
that's just a suspicion of mine at the moment.

Hugh