date:20110413

Re: [PATCH] drm/i915: restore only the mode of this driver on lastclose (v2)

2011-04-13 Thread Chris Wilson

On Wed, 13 Apr 2011 09:35:55 +1000, Dave Airlie  wrote:
> From: Dave Airlie 
> 
> i915 calls the panic handler function on last close to reset the modes,
> however this is a really bad idea for multi-gpu machines, esp shareable
> gpus machines. So add a new entry point for the driver to just restore
> its own fbcon mode.
> 
> v2: move code into fb helper, fix panic code to block mode change on
> powered off GPUs.

2 bugs in one patch?  This could be split into 3 steps... ;-)

Aside from that, looks good.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: Linux 2.6.39-rc3

2011-04-13 Thread Ingo Molnar


* Joerg Roedel  wrote:

> > > The problem does not happen with 2.6.38. I try to bisect this further 
> > > down to a commit. Alex, please let me know if you need any further 
> > > information.
> > 
> > If you can bisect it, that would be great.  Thanks,
> 
> Bisecting actually gave a very weird result. It points to
> 
>   d2137d5af4259f50c19addb8246a186c9ffac325
> 
> which is a merge-commit in the x86 tree. Even more weird is that this
> notebook is the only machine with these symptoms, all my other boxes are
> fine.
>
> During the bisect I tested commits from Yinghai which were good. It seems 
> like the problem appeared with the merge.

There's a similar looking bug being debugged here:

  https://bugzilla.kernel.org/show_bug.cgi?id=33012

Could you please send the before/after bootlog (in particular all memory init 
messages included) and your .config?

 before:  f005fe12b90c: x86-64: Move out cleanup higmap [_brk_end, _end) out of init_memory_mapping()
  after:  d2137d5af425: Merge branch 'linus' into x86/bootmem

I've Cc:-ed more people who might have an idea about it.

Thanks,

Ingo

[Bug 34534] resolution 3840x1024 stopped to work on HD5850 after switch to 2.6.37 kernel

2011-04-13 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=34534

--- Comment #15 from Peter Hercek 2011-04-13 00:37:56 PDT ---
Created an attachment (id=45562)
 --> (https://bugs.freedesktop.org/attachment.cgi?id=45562)
xrandr --verbose output on 2.6.38.2-vanilla (with 3840x1024 fixed using
radeonreg regset 0x770c 0x00020004)

-- 
Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are the assignee for the bug.

Re: Revert 737a3bb9416ce2a7c7a4170852473a4fcc9c67e8 ?

2011-04-13 Thread Gabriel Paubert

On Tue, Apr 12, 2011 at 07:29:22PM +0200, Michel Dänzer wrote:
> On Die, 2011-04-12 at 14:00 +0200, Gabriel Paubert wrote:
> > On Tue, Apr 12, 2011 at 01:46:10PM +0200, Michel Dänzer wrote:
> > > > 
> > > > With no_wb=1 the driver goes a bit further but the X server ends
> > > > up in an infinite ioctl loop and the logs are: 
> > > 
> > > Which ioctl does it loop on? Please provide the Xorg.0.log file as well.
> > 
> > From memory, the code was 0x64, which is DRM_RADEON_GEM_WAIT_IDLE.
> 
> Note that it's normal for this ioctl to be called every time before the
> GPU accessible pixmap memory is accessed by the CPU. Unless the ioctl
> always returns an error, this may not indicate a problem on its own. 

It seems to be an infinite loop, always returning EINTR because
of regular SIGALRM delivery.
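
As a rough illustration of why this looks like a hang from user space (a
sketch, not the actual X server or libdrm code): ioctl wrappers commonly
retry while the call fails with EINTR. The names retry_on_eintr and
fake_wait_idle below are made up for the example, and the fake call only
stands in for the real wait-idle ioctl. If a signal such as SIGALRM
interrupts every attempt before the wait can finish, a loop of this shape
never exits.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical callback type standing in for "issue the ioctl once". */
typedef int (*call_fn)(void *state);

/* Retry the call for as long as it fails with EINTR.
 * Counts the attempts so the caller can see how often it was interrupted. */
static int retry_on_eintr(call_fn fn, void *state, unsigned long *attempts)
{
    int ret;

    assert(fn != NULL && attempts != NULL);
    *attempts = 0;
    do {
        ret = fn(state);
        (*attempts)++;
    } while (ret == -1 && errno == EINTR);

    return ret;
}

/* Fake "wait idle" call: reports an interruption by a signal for the
 * first *remaining invocations, then succeeds. */
static int fake_wait_idle(void *state)
{
    int *remaining = state;

    if (*remaining > 0) {
        (*remaining)--;
        errno = EINTR;
        return -1;
    }
    return 0;
}
```

With the fake call interrupted three times, retry_on_eintr makes four
attempts and then returns 0. A call that is interrupted on every attempt
would keep the loop spinning, which matches the symptom described here.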

> 
> 
> > The Xorg.0.log from the previous boot is attached.
> 
> I don't see any obvious problems in it. Can you describe the symptoms of
> the problem you're having with X a bit more?

Well, X is dead, or rather in an infinite ioctl loop as described  above.
IIRC, the display enters a power-down mode and there is nothing to see.

> 
> One thing I notice is that the X server/driver are rather oldish. Maybe
> you can try newer versions from testing, sid or even experimental to see
> if that makes any difference.

I lack the time to do it until early May (I'll be away for 2 weeks starting on
Friday and am busy with urgent things). I'm indeed on Debian stable (Squeeze),
which is rather recent, and the machine is about 2 1/2 years old.

Gabriel

Re: Revert 737a3bb9416ce2a7c7a4170852473a4fcc9c67e8 ?

2011-04-13 Thread Gabriel Paubert

On Tue, Apr 12, 2011 at 01:46:10PM +0200, Michel Dänzer wrote:
> BTW, if your kernel contains commit
> 69a07f0b117a40fcc1a479358d8e1f41793617f2, can you try if reverting that
> helps?

My kernel is pristine 2.6.38 and does not include this commit
(was introduced before 2.6.39-rc1 according to gitk).

Gabriel

[Bug 34534] resolution 3840x1024 stopped to work on HD5850 after switch to 2.6.37 kernel

2011-04-13 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=34534

--- Comment #16 from Peter Hercek 2011-04-13 01:37:03 PDT ---
(In reply to comment #14)
> Does this patch help?

No, the image stays corrupted, I still need to do this to fix it:
# radeonreg regset 0x770c 0x00020004
OLD: 0x770c (770c)0x00010005 (65541)
NEW: 0x770c (770c)0x00010004 (65540)
#

I applied and tested the patch with 2.6.38.2-vanilla.


Re: Revert 737a3bb9416ce2a7c7a4170852473a4fcc9c67e8 ?

2011-04-13 Thread Benjamin Herrenschmidt

On Wed, 2011-04-13 at 09:59 +0200, Gabriel Paubert wrote:
> 
> Well, X is dead, or rather in an infinite ioctl loop as described
> above.
> IIRC, the display enters a power-down mode and there is nothing to
> see.

So basically the card crashed. There's about an infinite amount of
reasons why radeons do so, sometimes it has to do with them not liking
what you ate that day...

The only thing I can see that could be of use would be a bisect.

Cheers,
Ben.


Re: Revert 737a3bb9416ce2a7c7a4170852473a4fcc9c67e8 ?

2011-04-13 Thread Gabriel Paubert

On Wed, Apr 13, 2011 at 06:16:13PM +1000, Benjamin Herrenschmidt wrote:
> On Wed, 2011-04-13 at 09:59 +0200, Gabriel Paubert wrote:
> > 
> > Well, X is dead, or rather in an infinite ioctl loop as described
> > above.
> > IIRC, the display enters a power-down mode and there is nothing to
> > see.
> 
> So basically the card crashed. There's about an infinite amount of
> reasons why radeons do so, sometimes it has to do with them not liking
> what you ate that day...
> 
> The only thing I can see that could be of use would be a bisect

Bisecting for something which I have never got to work (radeon with
KMS) on this machine is something I don't know how to do...

Note that radeon without KMS also always ends up crashing, but it
may take hours. The only case where the machine works reliably is 
when glxinfo claims that it is using software rendering.

Regards,
Gabriel

Re: small git lesson [Was: Re: Revert 737a3bb9416ce2a7c7a4170852473a4fcc9c67e8 ?]

2011-04-13 Thread Gabriel Paubert

On Wed, Apr 13, 2011 at 10:59:14AM +0200, Andreas Schwab wrote:
> Uwe Kleine-König  writes:
> 
> > $ git name-rev --refs=refs/tags/v2.6\* 69a07f0b117a40fcc1a479358d8e1f41793617f2
> > 69a07f0b117a40fcc1a479358d8e1f41793617f2 tags/v2.6.39-rc2~3^2~43^2~4
> >
> > so it was introduced just before -rc2.
> 
> $ git tag --contains 69a07f0b117a40fcc1a479358d8e1f41793617f2
> v2.6.39-rc1
> v2.6.39-rc2
> 

So who is right? I think it was before rc1. 

Anyway I'm aware that there are other git commands, although for the option
details I often have to have a look at the man page.

However in this case the main reason to fire up gitk was to have a quick look
at the patch and its context, and I simply reported the "Precedes" line
in the display, which is 2.6.39-rc1. It also follows v2.6.37-rc2, which means
that it has been quite a long time outside the main tree.

Gabriel

[Bug 35502] Regression: black screen with Radeon KMS in 2.6.38 (2.6.37.4 worked fine)

2011-04-13 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=35502

Michel Dänzer  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |br...@canonical.com

--- Comment #9 from Michel Dänzer 2011-04-13 04:45:42 PDT ---
*** Bug 36007 has been marked as a duplicate of this bug. ***


Re: [PATCH] Big Endian support for RV730 (Mesa r600)

2011-04-13 Thread Benjamin Herrenschmidt

On Tue, 2011-04-12 at 10:01 +0200, Cédric Cano wrote:
> Hi
> 
> Here is a patch that adds big endian support for rv730 in the r600
> classic mesa driver. The BE modifications are almost the same as the DRM
> / DDX driver modifications
> (http://lists.freedesktop.org/archives/dri-devel/2011-February/008151.html).
> 
> I used the mesa-demos to test the driver status on a big endian platform.
> Nearly all demos render the same as on Intel architecture.
> Nevertheless, there are still some issues in glReadPixels (r600_blit) 
> with some formats. I can't figure out exactly what and when data must be 
> swapped (set_tex_resoures, set_render_target...). Review of the patch 
> would be greatly appreciated.
> 
> It seems that r600g will be the default for Mesa 7.11 so I'll try to 
> enable big endian support for Gallium now.

Cool stuff !

I'll try to test that one of these days on various ppc's

Cheers,
Ben.



Re: [PATCH] Big Endian support for RV730 (Mesa r600)

2011-04-13 Thread Benjamin Herrenschmidt

On Wed, 2011-04-13 at 22:05 +1000, Benjamin Herrenschmidt wrote:
> On Tue, 2011-04-12 at 10:01 +0200, Cédric Cano wrote:
> > Hi
> > 
> > Here is a patch that adds big endian support for rv730 in the r600
> > classic mesa driver. The BE modifications are almost the same as the DRM
> > / DDX driver modifications
> > (http://lists.freedesktop.org/archives/dri-devel/2011-February/008151.html).
> > 
> > I used the mesa-demos to test the driver status on a big endian platform.
> > Nearly all demos render the same as on Intel architecture.
> > Nevertheless, there are still some issues in glReadPixels (r600_blit) 
> > with some formats. I can't figure out exactly what and when data must be 
> > swapped (set_tex_resoures, set_render_target...). Review of the patch 
> > would be greatly appreciated.
> > 
> > It seems that r600g will be the default for Mesa 7.11 so I'll try to 
> > enable big endian support for Gallium now.
> 
> Cool stuff !
> 
> I'll try to test that one of these days on various ppc's

BTW, I see you used some FSL embedded board. Do you have your PCIe MMIO
space above 32-bit? Last I looked, there was a bunch of fixing that needed
to be done, among others in the TTM, to make that work.

I had some preliminary patches but they bitrotted... mostly the issue is to
make sure that a phys_addr_t is used instead of an unsigned long
whenever it tries to store the physical address of an object.
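
To illustrate the kind of bug meant here (a sketch with made-up struct and
function names, not the TTM code): on a 32-bit kernel unsigned long is 32
bits wide, so a bus address above 4 GiB stored in one loses its upper bits,
while a 64-bit phys_addr_t keeps them. Fixed-width types model both cases
so the example behaves the same on any host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins (assumptions for this example only):
 * ulong32_t models 'unsigned long' on a 32-bit kernel build,
 * phys64_t  models a 64-bit phys_addr_t. */
typedef uint32_t ulong32_t;
typedef uint64_t phys64_t;

/* Hypothetical objects that record where their memory sits on the bus. */
struct obj_narrow { ulong32_t base; };
struct obj_wide   { phys64_t  base; };

static void obj_narrow_set(struct obj_narrow *o, phys64_t addr)
{
    assert(o != NULL);
    o->base = (ulong32_t)addr;   /* upper 32 bits are silently dropped */
}

static void obj_wide_set(struct obj_wide *o, phys64_t addr)
{
    assert(o != NULL);
    o->base = addr;              /* full address preserved */
}

/* Return 1 if the address read back equals the address stored. */
static int narrow_roundtrips(phys64_t addr)
{
    struct obj_narrow o;

    obj_narrow_set(&o, addr);
    return (phys64_t)o.base == addr;
}

static int wide_roundtrips(phys64_t addr)
{
    struct obj_wide o;

    obj_wide_set(&o, addr);
    return o.base == addr;
}
```

An address below 4 GiB such as 0x80000000 survives in both variants; an
address such as 0xC00000000 only survives in the wide one.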

Ben.
 


Re: Revert 737a3bb9416ce2a7c7a4170852473a4fcc9c67e8 ?

2011-04-13 Thread Michel Dänzer

On Mit, 2011-04-13 at 09:59 +0200, Gabriel Paubert wrote: 
> On Tue, Apr 12, 2011 at 07:29:22PM +0200, Michel Dänzer wrote:
> > On Die, 2011-04-12 at 14:00 +0200, Gabriel Paubert wrote:
> > > On Tue, Apr 12, 2011 at 01:46:10PM +0200, Michel Dänzer wrote:
> > > > > 
> > > > > With no_wb=1 the driver goes a bit further but the X server ends
> > > > > up in an infinite ioctl loop and the logs are: 
> > > > 
> > > > Which ioctl does it loop on? Please provide the Xorg.0.log file as well.
> > > 
> > > From memory, the code was 0x64, which is DRM_RADEON_GEM_WAIT_IDLE.
> > 
> > Note that it's normal for this ioctl to be called every time before the
> > GPU accessible pixmap memory is accessed by the CPU. Unless the ioctl
> > always returns an error, this may not indicate a problem on its own. 
> 
> It seems to be an infinite loop, always returning EINTR because
> of regular SIGALRM delivery.

That does sound like the GPU locks up. Do you get any messages in dmesg
about lockups and attempts to reset the GPU at any time?


-- 
Earthling Michel Dänzer   |http://www.vmware.com
Libre software enthusiast |  Debian, X and DRI developer

Re: Revert 737a3bb9416ce2a7c7a4170852473a4fcc9c67e8 ?

2011-04-13 Thread Gabriel Paubert

On Wed, Apr 13, 2011 at 02:12:16PM +0200, Michel Dänzer wrote:
> On Mit, 2011-04-13 at 09:59 +0200, Gabriel Paubert wrote: 
> > On Tue, Apr 12, 2011 at 07:29:22PM +0200, Michel Dänzer wrote:
> > > On Die, 2011-04-12 at 14:00 +0200, Gabriel Paubert wrote:
> > > > On Tue, Apr 12, 2011 at 01:46:10PM +0200, Michel Dänzer wrote:
> > > > > > 
> > > > > > With no_wb=1 the driver goes a bit further but the X server ends
> > > > > > up in an infinite ioctl loop and the logs are: 
> > > > > 
> > > > > Which ioctl does it loop on? Please provide the Xorg.0.log file as well.
> > > > 
> > > > From memory, the code was 0x64, which is DRM_RADEON_GEM_WAIT_IDLE.
> > > 
> > > Note that it's normal for this ioctl to be called every time before the
> > > GPU accessible pixmap memory is accessed by the CPU. Unless the ioctl
> > > always returns an error, this may not indicate a problem on its own. 
> > 
> > It seems to be an infinite loop, always returning EINTR because
> > of regular SIGALRM delivery.
> 
> That does sound like the GPU locks up. Do you get any messages in dmesg
> about lockups and attempts to reset the GPU at any time?

No.

Gabriel

Re: Revert 737a3bb9416ce2a7c7a4170852473a4fcc9c67e8 ?

2011-04-13 Thread Michel Dänzer

On Mit, 2011-04-13 at 14:27 +0200, Gabriel Paubert wrote: 
> On Wed, Apr 13, 2011 at 02:12:16PM +0200, Michel Dänzer wrote:
> > On Mit, 2011-04-13 at 09:59 +0200, Gabriel Paubert wrote: 
> > > On Tue, Apr 12, 2011 at 07:29:22PM +0200, Michel Dänzer wrote:
> > > > On Die, 2011-04-12 at 14:00 +0200, Gabriel Paubert wrote:
> > > > > On Tue, Apr 12, 2011 at 01:46:10PM +0200, Michel Dänzer wrote:
> > > > > > > 
> > > > > > > With no_wb=1 the driver goes a bit further but the X server ends
> > > > > > > up in an infinite ioctl loop and the logs are: 
> > > > > > 
> > > > > > Which ioctl does it loop on? Please provide the Xorg.0.log file as well.
> > > > > 
> > > > > From memory, the code was 0x64, which is DRM_RADEON_GEM_WAIT_IDLE.
> > > > 
> > > > Note that it's normal for this ioctl to be called every time before the
> > > > GPU accessible pixmap memory is accessed by the CPU. Unless the ioctl
> > > > always returns an error, this may not indicate a problem on its own. 
> > > 
> > > It seems to be an infinite loop, always returning EINTR because
> > > of regular SIGALRM delivery.
> > 
> > That does sound like the GPU locks up. Do you get any messages in dmesg
> > about lockups and attempts to reset the GPU at any time?
> 
> No.

Hmm, I guess the constant SIGALRMs might prevent the lockup detection
from kicking in... Maybe you can try starting the X server with
-dumbSched to see if that gets things along any further, but in the end
there's probably no way around figuring out what causes the lockup and
fixing that anyway.


-- 
Earthling Michel Dänzer   |http://www.vmware.com
Libre software enthusiast |  Debian, X and DRI developer

Re: [PATCH] drm/radeon/kms: fix suspend on rv530 asics

2011-04-13 Thread Jerome Glisse

On Tue, Apr 12, 2011 at 1:33 PM, Alex Deucher  wrote:
> Apparently only rv515 asics need the workaround
> added in f24d86f1a49505cdea56728b853a5d0a3f8e3d11
> (drm/radeon/kms: fix resume regression for some r5xx laptops).
>
> Fixes:
> https://bugs.freedesktop.org/show_bug.cgi?id=34709
>
> Signed-off-by: Alex Deucher 
> Cc: sta...@kernel.org
> ---
>  drivers/gpu/drm/radeon/atom.c |    6 +-
>  1 files changed, 5 insertions(+), 1 deletions(-)
>
> diff --git a/drivers/gpu/drm/radeon/atom.c b/drivers/gpu/drm/radeon/atom.c
> index 258fa5e..d71d375 100644
> --- a/drivers/gpu/drm/radeon/atom.c
> +++ b/drivers/gpu/drm/radeon/atom.c
> @@ -32,6 +32,7 @@
>  #include "atom.h"
>  #include "atom-names.h"
>  #include "atom-bits.h"
> +#include "radeon.h"
>
>  #define ATOM_COND_ABOVE                0
>  #define ATOM_COND_ABOVEOREQUAL 1
> @@ -101,7 +102,9 @@ static void debug_print_spaces(int n)
>  static uint32_t atom_iio_execute(struct atom_context *ctx, int base,
>                                 uint32_t index, uint32_t data)
>  {
> +       struct radeon_device *rdev = ctx->card->dev->dev_private;
>        uint32_t temp = 0xCDCDCDCD;
> +
>        while (1)
>                switch (CU8(base)) {
>                case ATOM_IIO_NOP:
> @@ -112,7 +115,8 @@ static uint32_t atom_iio_execute(struct atom_context *ctx, int base,
>                        base += 3;
>                        break;
>                case ATOM_IIO_WRITE:
> -                       (void)ctx->card->ioreg_read(ctx->card, CU16(base + 1));
> +                       if (rdev->family == CHIP_RV515)
> +                               (void)ctx->card->ioreg_read(ctx->card, CU16(base + 1));
>                        ctx->card->ioreg_write(ctx->card, CU16(base + 1), temp);
>                        base += 3;
>                        break;
> --
> 1.7.1.1
>


So this patch enables the io write only for one family? This looks utterly strange.

Cheers,
Jerome

Re: [PATCH] drm/radeon/kms: fix suspend on rv530 asics

2011-04-13 Thread Alex Deucher

On Wed, Apr 13, 2011 at 10:46 AM, Jerome Glisse  wrote:
> On Tue, Apr 12, 2011 at 1:33 PM, Alex Deucher  wrote:
>> Apparently only rv515 asics need the workaround
>> added in f24d86f1a49505cdea56728b853a5d0a3f8e3d11
>> (drm/radeon/kms: fix resume regression for some r5xx laptops).
>>
>> Fixes:
>> https://bugs.freedesktop.org/show_bug.cgi?id=34709
>>
>> Signed-off-by: Alex Deucher 
>> Cc: sta...@kernel.org
>> ---
>>  drivers/gpu/drm/radeon/atom.c |    6 +-
>>  1 files changed, 5 insertions(+), 1 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/radeon/atom.c b/drivers/gpu/drm/radeon/atom.c
>> index 258fa5e..d71d375 100644
>> --- a/drivers/gpu/drm/radeon/atom.c
>> +++ b/drivers/gpu/drm/radeon/atom.c
>> @@ -32,6 +32,7 @@
>>  #include "atom.h"
>>  #include "atom-names.h"
>>  #include "atom-bits.h"
>> +#include "radeon.h"
>>
>>  #define ATOM_COND_ABOVE                0
>>  #define ATOM_COND_ABOVEOREQUAL 1
>> @@ -101,7 +102,9 @@ static void debug_print_spaces(int n)
>>  static uint32_t atom_iio_execute(struct atom_context *ctx, int base,
>>                                 uint32_t index, uint32_t data)
>>  {
>> +       struct radeon_device *rdev = ctx->card->dev->dev_private;
>>        uint32_t temp = 0xCDCDCDCD;
>> +
>>        while (1)
>>                switch (CU8(base)) {
>>                case ATOM_IIO_NOP:
>> @@ -112,7 +115,8 @@ static uint32_t atom_iio_execute(struct atom_context *ctx, int base,
>>                        base += 3;
>>                        break;
>>                case ATOM_IIO_WRITE:
>> -                       (void)ctx->card->ioreg_read(ctx->card, CU16(base + 1));
>> +                       if (rdev->family == CHIP_RV515)
>> +                               (void)ctx->card->ioreg_read(ctx->card, CU16(base + 1));
>>                        ctx->card->ioreg_write(ctx->card, CU16(base + 1), temp);
>>                        base += 3;
>>                        break;
>> --
>> 1.7.1.1
>>
>
>
> So this patch enables the io write only for one family? This looks utterly strange.

No, it just does a read before write for rv515.  I don't know why it
needs it, but it seems to.

Alex

>
> Cheers,
> Jerome
>

[Bug 25588] Lots of ARB_vertex_program/fragment_program parser errors in ETQW (if GLSL is unavailable)

2011-04-13 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=25588

Fabio Pedretti  changed:

           What    |Removed                         |Added
----------------------------------------------------------------------------
         Resolution|WORKSFORME                      |WONTFIX
          Component|Mesa core                       |Drivers/DRI/r300
         AssignedTo|mesa-dev@lists.freedesktop.org  |dri-devel@lists.freedesktop.org


[Bug 33222] New: [RADEON] Oops in worker thread for radeon_unpin_work_func

2011-04-13 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=33222

   Summary: [RADEON] Oops in worker thread for radeon_unpin_work_func
   Product: Drivers
   Version: 2.5
Kernel Version: 2.6.38.2
  Platform: All
OS/Version: Linux
  Tree: Mainline
Status: NEW
  Severity: low
  Priority: P1
 Component: Video(DRI - non Intel)
AssignedTo: drivers_video-...@kernel-bugs.osdl.org
ReportedBy: tho...@m3y3r.de
Regression: No


Created an attachment (id=54282)
 --> (https://bugzilla.kernel.org/attachment.cgi?id=54282)
Oops - Part 1

A few days ago I stumbled upon the attached oops. Just images, sorry for that.
This is the first time I have seen this oops. I have hit it just once, with 2.6.38.

-- 
Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are watching the assignee of the bug.

___
Dri-devel mailing list
dri-de...@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel

[Bug 33222] [RADEON] Oops in worker thread for radeon_unpin_work_func

2011-04-13 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=33222





--- Comment #1 from Thomas Meyer   2011-04-13 17:09:48 ---
Created an attachment (id=54292)
 --> (https://bugzilla.kernel.org/attachment.cgi?id=54292)
Oops - Part 2


small git lesson [Was: Re: Revert 737a3bb9416ce2a7c7a4170852473a4fcc9c67e8 ?]

2011-04-13 Thread Uwe Kleine-König

On Wed, Apr 13, 2011 at 10:02:04AM +0200, Gabriel Paubert wrote:
> On Tue, Apr 12, 2011 at 01:46:10PM +0200, Michel Dänzer wrote:
> > BTW, if your kernel contains commit
> > 69a07f0b117a40fcc1a479358d8e1f41793617f2, can you try if reverting that
> > helps?
> 
> My kernel is pristine 2.6.38 and does not include this commit
> (was introduced before 2.6.39-rc1 according to gitk).
gitk is not the best tool to find this out.

$ git name-rev --refs=refs/tags/v2.6\* 69a07f0b117a40fcc1a479358d8e1f41793617f2
69a07f0b117a40fcc1a479358d8e1f41793617f2 tags/v2.6.39-rc2~3^2~43^2~4

so it was introduced just before -rc2.

Best regards
Uwe

-- 
Pengutronix e.K.   | Uwe Kleine-König|
Industrial Linux Solutions | http://www.pengutronix.de/  |

Re: small git lesson [Was: Re: Revert 737a3bb9416ce2a7c7a4170852473a4fcc9c67e8 ?]

2011-04-13 Thread Andreas Schwab

Uwe Kleine-König  writes:

> $ git name-rev --refs=refs/tags/v2.6\* 69a07f0b117a40fcc1a479358d8e1f41793617f2
> 69a07f0b117a40fcc1a479358d8e1f41793617f2 tags/v2.6.39-rc2~3^2~43^2~4
>
> so it was introduced just before -rc2.

$ git tag --contains 69a07f0b117a40fcc1a479358d8e1f41793617f2
v2.6.39-rc1
v2.6.39-rc2

Andreas.

-- 
Andreas Schwab, sch...@redhat.com
GPG Key fingerprint = D4E8 DBE3 3813 BB5D FA84  5EC7 45C6 250E 6F00 984E
"And now for something completely different."

Re: small git lesson [Was: Re: Revert 737a3bb9416ce2a7c7a4170852473a4fcc9c67e8 ?]

2011-04-13 Thread Uwe Kleine-König

Hello Gabriel
On Wed, Apr 13, 2011 at 12:31:44PM +0200, Gabriel Paubert wrote:
> On Wed, Apr 13, 2011 at 10:59:14AM +0200, Andreas Schwab wrote:
> > Uwe Kleine-König  writes:
> > 
> > > $ git name-rev --refs=refs/tags/v2.6\* 69a07f0b117a40fcc1a479358d8e1f41793617f2
> > > 69a07f0b117a40fcc1a479358d8e1f41793617f2 tags/v2.6.39-rc2~3^2~43^2~4
> > >
> > > so it was introduced just before -rc2.
> > 
> > $ git tag --contains 69a07f0b117a40fcc1a479358d8e1f41793617f2
> > v2.6.39-rc1
> > v2.6.39-rc2
> > 
> 
> So who is right? I think it was before rc1. 
Yep, correct. I interpreted the output of git name-rev to mean it's not
included in a tag earlier than v2.6.39-rc2, but actually that's wrong.
It's just that it's easier (for some definition of easy) to reach the
commit in question from v2.6.39-rc2 than from v2.6.39-rc1.

> However in this case the main reason to fire gitk was to have a quick look 
> at the patch and its context, and simply reported the "Precedes" line 
> in the display, which is 2.6.39-rc1. It also follow v2.6.37-rc2, which means
> that it has been quite a long time outside the main tree.
I think this conclusion isn't valid in general. (E.g. in git itself a
bug-fix is often done on top of the commit that introduced the bug and then
merged into master. Still the bugfix might be new.) But looking at the
AuthorDate of 69a07f0b117a seems to support your statement.

Best regards
Uwe

-- 
Pengutronix e.K.   | Uwe Kleine-König|
Industrial Linux Solutions | http://www.pengutronix.de/  |

[Bug 33222] [RADEON] Oops in worker thread for radeon_unpin_work_func

2011-04-13 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=33222


Alex Deucher  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |alexdeuc...@gmail.com




--- Comment #2 from Alex Deucher 2011-04-13 17:19:06 ---
This is a duplicate of bug 32402.


Re: Linux 2.6.39-rc3

2011-04-13 Thread Joerg Roedel

On Wed, Apr 13, 2011 at 08:46:09AM +0200, Ingo Molnar wrote:
> Could you please send the before/after bootlog (in particular all memory init 
> messages included) and your .config?
> 
>  before:  f005fe12b90c: x86-64: Move out cleanup higmap [_brk_end, _end) out of init_memory_mapping()
>   after:  d2137d5af425: Merge branch 'linus' into x86/bootmem
> 
> I've Cc:-ed more people who might have an idea about it.

Okay, I have done some more bisecting and debugging today.

First of all, I bisected between v2.6.37-rc2..f005fe12b90c, which were
only a couple of patches, and merged v2.6.38-rc4 in at every step. No
failure was found.
Then I tried this again, but this time I merged v2.6.38-rc5 at every
step and was successful. The bad commit in this branch turned out to be

1a4a678b12c84db9ae5dce424e0e97f0559bb57c

which is related to memblock.

Then I tried to find out which change between 2.6.38-rc4 and 2.6.38-rc5
is needed to trigger the failure, so I used f005fe12b90c as a base,
bisected between v2.6.38-rc4..v2.6.38-rc5 and merged every bisect step
into the base and tested. Here the bad commit turned out to be

e6d2e2b2b1e1455df16d68a78f4a3874c7b3ad20

which is related to gart. It turned out that the gart aperture on that
box is at a different position with these patches. Before it was at
0xa400 and now it is at 0xa000. It seems like this has something
to do with the root cause.

Reverting commit 1a4a678b12c84db9ae5dce424e0e97f0559bb57c fixes the
problem btw. and booting with iommu=soft also works, but I have no idea
yet why the aperture at that address is a problem (with the patch
reverted the aperture lands at 0x8000).

I have put some debug data online. There is my .config and two
dmesg files for good (==2.6.39-rc3 + revert) and bad (==2.6.39-rc3).
I also created these dmesg files again with memblock=debug, maybe that
helps to find the problem. The files are at

http://www.8bytes.org/~joro/debug/

Or someone else has an idea about the issue...

Joerg


[Bug 30651] [RADEON:KMS:R600G] gl output in mplayer have no colors if used with a fragment program with additional lookup and bicubic B-spline filtering

2011-04-13 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=30651

--- Comment #7 from Andy Furniss 2011-04-13 10:23:46 PDT ---
(In reply to comment #6)

> 1) yuv=4 on r600g still have no colours even though with r300g they are ok

yuv=4 with or without bicubic now works for me on 600g

> 2) still there is an overbright glitch in some white places in some videos with
> yuv=6. but again, it may be a mplayer bug since it present with r300g too (but
> not software rasterizer), i'm not sure.

This is still the same.

One general observation is that with 600g perf is poor compared to 600c or xv,
which are at least 2x faster when benchmarking with HD streams.


[Bug 30651] [RADEON:KMS:R600G] gl output in mplayer have no colors if used with a fragment program with additional lookup and bicubic B-spline filtering

2011-04-13 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=30651

--- Comment #8 from Sergey Kondakov 2011-04-13 10:46:29 PDT ---
Same here.
And I never got an answer about which method is better with an amd/ati card and
the open stack now. I hope devs are looking into that stuff.


[Bug 30651] [RADEON:KMS:R600G] gl output in mplayer have no colors if used with a fragment program with additional lookup and bicubic B-spline filtering

2011-04-13 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=30651

--- Comment #9 from Andy Furniss 2011-04-13 11:38:47 PDT ---
(In reply to comment #8)
> same here.
> and i never got answer about which method is better with amd/ati card and open
> stack now. i hope devs are looking into that stuff.

Maybe there isn't an answer as such for that question.

I guess someone with an on-board low spec GPU may be more limited than a high
end card with fast vram.

Quality wise - I can't see any difference, the higher yuv= numbers give more
features like gamma correction (not sure how to use it though).

It would be nice if 600g could beat or equal 600c - it does for 3D, but for
some reason not this.

I said classic was twice as fast - it's actually more than that if I discount
time taken by the codec.


[Bug 32982] Kernel locks up a few minutes after boot

2011-04-13 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=32982





--- Comment #6 from Bart Van Assche 2011-04-13 18:49:13 ---
Although I'm still busy bisecting, I'd like to report that I got the following
hung task report with head b73a21fc66fee35b41db755abebfacba48b2fc76 (had
already seen something similar before with 2.6.39-rc2):

INFO: task kjournald:918 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kjournald   D 880131b9ddb8 0   918  2 0x
 880131b9dd20 0046 880131b9dca0 8108cd6d
 0282 880131b9dfd8 880137729f40 880131b9dfd8
 880131b9c000 880131b9c000 880131b9c000 880131b9dfd8
Call Trace:
 [] ? trace_hardirqs_on_caller+0x14d/0x190
 [] ? sub_preempt_count+0xa9/0xe0
 [] journal_commit_transaction+0x13e/0x1590 [jbd]
 [] ? _raw_spin_unlock_irqrestore+0x65/0x80
 [] ? sub_preempt_count+0xa9/0xe0
 [] ? wake_up_bit+0x40/0x40
 [] ? del_timer_sync+0x8a/0xc0
 [] ? try_to_del_timer_sync+0x110/0x110
 [] kjournald+0xf1/0x250 [jbd]
 [] ? wake_up_bit+0x40/0x40
 [] ? commit_timeout+0x10/0x10 [jbd]
 [] kthread+0x96/0xa0
 [] kernel_thread_helper+0x4/0x10
 [] ? finish_task_switch+0x7b/0xe0
 [] ? _raw_spin_unlock_irq+0x3b/0x60
 [] ? retint_restore_args+0xe/0xe
 [] ? __init_kthread_worker+0x70/0x70
 [] ? gs_change+0xb/0xb
no locks held by kjournald/918.
INFO: task klauncher:5744 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
klauncher   D 0001000297b4 0  5744   5743 0x
 88011dd73938 0046 8801 8108cbef
 813e2535 88011dd73fd8 8801382e1f40 88011dd73fd8
 88011dd72000 88011dd72000 88011dd72000 88011dd73fd8
Call Trace:
 [] ? mark_held_locks+0x6f/0xa0
 [] ? _raw_spin_unlock_irqrestore+0x65/0x80
 [] ? __wait_on_buffer+0x30/0x30
 [] io_schedule+0x59/0x80
 [] sleep_on_buffer+0xe/0x20
 [] __wait_on_bit_lock+0x5a/0xc0
 [] ? __wait_on_buffer+0x30/0x30
 [] out_of_line_wait_on_bit_lock+0x78/0x90
 [] ? autoremove_wake_function+0x50/0x50
 [] __lock_buffer+0x36/0x40
 [] do_get_write_access+0x64d/0x660 [jbd]
 [] ? sub_preempt_count+0xa9/0xe0
 [] ? start_this_handle+0x370/0x470 [jbd]
 [] ? journal_add_journal_head+0xf4/0x220 [jbd]
 [] journal_get_write_access+0x31/0x50 [jbd]
 [] __ext3_journal_get_write_access+0x2d/0x60 [ext3]
 [] ext3_reserve_inode_write+0x83/0xb0 [ext3]
 [] ext3_mark_inode_dirty+0x44/0x70 [ext3]
 [] ext3_dirty_inode+0x5e/0xa0 [ext3]
 [] __mark_inode_dirty+0x3f/0x250
 [] file_update_time+0xec/0x170
 [] ? mutex_lock_nested+0x27d/0x3a0
 [] __generic_file_aio_write+0x1f8/0x440
 [] generic_file_aio_write+0x75/0xf0
 [] do_sync_write+0xda/0x120
 [] ? remove_vma+0x77/0x90
 [] ? trace_hardirqs_on+0xd/0x10
 [] ? remove_vma+0x77/0x90
 [] vfs_write+0xc6/0x170
 [] sys_write+0x51/0x90
 [] system_call_fastpath+0x16/0x1b
2 locks held by klauncher/5744:
 #0:  (&sb->s_type->i_mutex_key#11){+.+.+.}, at: []
generic_file_aio_write+0x59/0xf0
 #1:  (jbd_handle){+.+...}, at: []
start_this_handle+0x370/0x470 [jbd]
INFO: task okular:4180 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
okular  D 00010002a251 0  4180   5743 0x
 880041d13aa8 0046 8800 8108cd6d
 0282 880041d13fd8 880037a59f40 880041d13fd8
 880041d12000 880041d12000 880041d12000 880041d13fd8
Call Trace:
 [] ? trace_hardirqs_on_caller+0x14d/0x190
 [] start_this_handle+0x244/0x470 [jbd]
 [] ? is_module_address+0x33/0x60
 [] ? wake_up_bit+0x40/0x40
 [] journal_start+0xdb/0x120 [jbd]
 [] ext3_journal_start_sb+0x36/0x70 [ext3]
 [] ext3_setattr+0x1a3/0x210 [ext3]
 [] notify_change+0x116/0x360
 [] do_truncate+0x63/0x90
 [] ? sub_preempt_count+0xa9/0xe0
 [] do_last+0x42c/0x820
 [] path_openat+0xd0/0x410
 [] ? might_fault+0x53/0xb0
 [] do_filp_open+0x7f/0xa0
 [] ? sub_preempt_count+0xa9/0xe0
 [] ? _raw_spin_unlock+0x35/0x60
 [] ? alloc_fd+0xf4/0x150
 [] do_sys_open+0x101/0x1e0
 [] sys_open+0x20/0x30
 [] system_call_fastpath+0x16/0x1b
2 locks held by okular/4180:
 #0:  (&sb->s_type->i_mutex_key#11){+.+.+.}, at: []
do_truncate+0x57/0x90
 #1:  (&sb->s_type->i_alloc_sem_key#4){+.+...}, at: []
notify_change+0x2a0/0x360


Re: Linux 2.6.39-rc3

2011-04-13 Thread H. Peter Anvin

On 04/13/2011 10:21 AM, Joerg Roedel wrote:
> 
> First of all, I bisected between v2.6.37-rc2..f005fe12b90c which where
> only a couple of patches and merged v2.6.38-rc4 in at every step. There
> was no failure found.
> Then I tried this again, but this time I merged v2.6.38-rc5 at every
> step and was successful. The bad commit in this branch turned out to be
> 
>   1a4a678b12c84db9ae5dce424e0e97f0559bb57c
> 
> which is related to memblock.
> 
> Then I tried to find out which change between 2.6.38-rc4 and 2.6.38-rc5
> is needed to trigger the failure, so I used f005fe12b90c as a base,
> bisected between v2.6.38-rc4..v2.6.38-rc5 and merged every bisect step
> into the base and tested. Here the bad commit turned out to be
> 
>   e6d2e2b2b1e1455df16d68a78f4a3874c7b3ad20
> 
> which is related to gart. It turned out that the gart aperture on that
> box is on another position with these patches. Before it was as
> 0xa400 and now it is at 0xa000. It seems like this has something
> to do with the root-cause.
> 
> Reverting commit 1a4a678b12c84db9ae5dce424e0e97f0559bb57c fixes the
> problem btw. and booting with iommu=soft also works, but I have no idea
> yet why the aperture at that address is a problem (with the patch
> reverted the aperture lands at 0x8000).
> 

Does reverting e6d2e2b2b1e1455df16d68a78f4a3874c7b3ad20 solve the
problem for you?

1a4a678b12c84db9ae5dce424e0e97f0559bb57c is a memory-allocation-order
patch, which have a nasty tendency to unmask bugs elsewhere in the
kernel.  However, e6d2e2b2b1e1455df16d68a78f4a3874c7b3ad20 looks
positively strange (and it doesn't exactly help that the description is
written in Yinghai-ese and is therefore nearly impossible to decode,
never mind tell if it is remotely correct.)

-hpa



Re: Linux 2.6.39-rc3

2011-04-13 Thread Yinghai Lu

On 04/13/2011 10:21 AM, Joerg Roedel wrote:
> On Wed, Apr 13, 2011 at 08:46:09AM +0200, Ingo Molnar wrote:
> First of all, I bisected between v2.6.37-rc2..f005fe12b90c which where
> only a couple of patches and merged v2.6.38-rc4 in at every step. There
> was no failure found.
> Then I tried this again, but this time I merged v2.6.38-rc5 at every
> step and was successful. The bad commit in this branch turned out to be
> 
>   1a4a678b12c84db9ae5dce424e0e97f0559bb57c
> 
> which is related to memblock.
> 
> Then I tried to find out which change between 2.6.38-rc4 and 2.6.38-rc5
> is needed to trigger the failure, so I used f005fe12b90c as a base,
> bisected between v2.6.38-rc4..v2.6.38-rc5 and merged every bisect step
> into the base and tested. Here the bad commit turned out to be
> 
>   e6d2e2b2b1e1455df16d68a78f4a3874c7b3ad20
> 
> which is related to gart. It turned out that the gart aperture on that
> box is on another position with these patches. Before it was as
> 0xa400 and now it is at 0xa000. It seems like this has something
> to do with the root-cause.
> 
> Reverting commit 1a4a678b12c84db9ae5dce424e0e97f0559bb57c fixes the
> problem btw. and booting with iommu=soft also works, but I have no idea
> yet why the aperture at that address is a problem (with the patch
> reverted the aperture lands at 0x8000).
> 
> I have put some debug-data online. There is my .config and two
> dmesg-files for good (==2.6.39-rc3 + revert) and bad (==2.6.39-rc3)
> I also created these dmesg-files again with memblock=debug, maybe that
> helps to find the problem. The files are at
> 
>   http://www.8bytes.org/~joro/debug/

thanks for the bisecting...

so those two patches uncover some problems.

[0.00] Checking aperture...
[0.00] No AGP bridge found
[0.00] Node 0: aperture @ a000 size 32 MB
[0.00] Aperture pointing to e820 RAM. Ignoring.
[0.00] Your BIOS doesn't leave a aperture memory hole
[0.00] Please enable the IOMMU option in the BIOS setup
[0.00] This costs you 64 MB of RAM
[0.00] memblock_x86_reserve_range: [0xa000-0xa3ff]   
aperture64
[0.00] Mapping aperture over 65536 KB of RAM @ a000

So the kernel tries to reallocate the aperture, because the one the BIOS
allocated points to RAM or its size is too small.
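
The "Aperture pointing to e820 RAM. Ignoring." decision in the log above can be pictured as an interval-overlap test against the firmware memory map. What follows is a minimal sketch under stated assumptions, not the kernel's actual aperture code: struct region, overlaps_usable_ram() and aperture_usable() are hypothetical names, and the minimum-size rule is an assumption drawn from the "size is too small" remark above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum region_type { REGION_USABLE, REGION_RESERVED };

/* One entry of a simplified, hypothetical e820-style memory map. */
struct region {
	uint64_t start;          /* inclusive */
	uint64_t end;            /* exclusive */
	enum region_type type;
};

/* True if [start, end) intersects any usable-RAM entry in the map. */
static bool overlaps_usable_ram(const struct region *map, size_t n,
				uint64_t start, uint64_t end)
{
	assert(start <= end);
	for (size_t i = 0; i < n; i++) {
		if (map[i].type != REGION_USABLE)
			continue;
		/* Two half-open intervals overlap iff each starts before the other ends. */
		if (start < map[i].end && map[i].start < end)
			return true;
	}
	return false;
}

/*
 * Hypothetical validity check: reject an aperture that is unset,
 * smaller than min_size, or placed on top of usable RAM.
 */
static bool aperture_usable(const struct region *map, size_t n,
			    uint64_t base, uint64_t size, uint64_t min_size)
{
	if (base == 0 || size < min_size)
		return false;
	return !overlaps_usable_ram(map, n, base, base + size);
}
```

When such a check fails, the kernel falls back to carving a new aperture out of RAM, which is the "This costs you 64 MB of RAM" path in the log.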

but your radeon does use [0xa000, 0xbfff)

[4.281993] radeon :01:05.0: VRAM: 320M 0xC000 - 
0xD3FF (320M used)
[4.290672] radeon :01:05.0: GTT: 512M 0xA000 - 
0xBFFF
[4.298550] [drm] Detected VRAM RAM=320M, BAR=256M
[4.309857] [drm] RAM width 32bits DDR
[4.313748] [TTM] Zone  kernel: Available graphics memory: 1896524 kiB.
[4.320379] [TTM] Initializing pool allocator.
[4.324948] [drm] radeon: 320M of VRAM memory ready
[4.329832] [drm] radeon: 512M of GTT memory ready.

and the one that seems to work:

[0.00] Checking aperture...
[0.00] No AGP bridge found
[0.00] Node 0: aperture @ a000 size 32 MB
[0.00] Aperture pointing to e820 RAM. Ignoring.
[0.00] Your BIOS doesn't leave a aperture memory hole
[0.00] Please enable the IOMMU option in the BIOS setup
[0.00] This costs you 64 MB of RAM
[0.00] memblock_x86_reserve_range: [0x8000-0x83ff]   
aperture64
[0.00] Mapping aperture over 65536 KB of RAM @ 8000
[0.00] memblock_x86_reserve_range: [0xacb6bdc0-0xacb6bddf]  
BOOTMEM

which will use a different position...

[4.250159] radeon :01:05.0: VRAM: 320M 0xC000 - 
0xD3FF (320M used)
[4.258830] radeon :01:05.0: GTT: 512M 0xA000 - 
0xBFFF
[4.266742] [drm] Detected VRAM RAM=320M, BAR=256M
[4.271549] [drm] RAM width 32bits DDR
[4.275435] [TTM] Zone  kernel: Available graphics memory: 1896526 kiB.
[4.282066] [TTM] Initializing pool allocator.
[4.282085] usb 7-2: new full speed USB device number 2 using ohci_hcd
[4.293076] [drm] radeon: 320M of VRAM memory ready
[4.298277] [drm] radeon: 512M of GTT memory ready.
[4.303218] [drm] Supports vblank timestamp caching Rev 1 (10.10.2010).
[4.309854] [drm] Driver supports precise vblank timestamp query.
[4.315970] [drm] radeon: irq initialized.
[4.320094] [drm] GART: num cpu pages 131072, num gpu pages 131072

So the question is why radeon is using the address range [0xa000 - 0xc00]
when E820 reports it as RAM.

[0.00]  BIOS-e820: 0010 - acb8d000 (usable)
[0.00]  BIOS-e820: acb8d000 - acb8f000 (reserved)
[0.00]  BIOS-e820: acb8f000 - afce9000 (usable)
[0.00]  BIOS-e820: afce9000 - afd21000 (reserved)
[0.00]  BIOS-e820: afd21000 - afd4f000 (usable)
[0.00]  BIOS-e820: afd4f000 - afdcf000 (reserved)
[0.00]  BIOS-e820: afdcf000

Re: Linux 2.6.39-rc3

2011-04-13 Thread H. Peter Anvin

On 04/13/2011 10:21 AM, Joerg Roedel wrote:
> On Wed, Apr 13, 2011 at 08:46:09AM +0200, Ingo Molnar wrote:
>> Could you please send the before/after bootlog (in particular all memory 
>> init 
>> messages included) and your .config?
>>
>>  before:  f005fe12b90c: x86-64: Move out cleanup higmap [_brk_end, _end) out 
>> of init_memory_mapping()
>>   after:  d2137d5af425: Merge branch 'linus' into x86/bootmem
>>
>> I've Cc:-ed more people who might have an idea about it.
> 
> Okay, I have done some more bisecting and debugging today.
> 

First of all, *huge* thanks for this effort.  At least we need to track
down the bits that need to be reverted -- it is past rc3, and it's time
to see what we should revert and tell the submitter to try again next cycle.

This looks to be the same issue as in bugzilla 33012:

https://bugzilla.kernel.org/show_bug.cgi?id=33012

... so it would be good if we could keep the information in there.

-hpa

Re: Linux 2.6.39-rc3

2011-04-13 Thread Joerg Roedel

On Wed, Apr 13, 2011 at 11:51:39AM -0700, H. Peter Anvin wrote:
> On 04/13/2011 10:21 AM, Joerg Roedel wrote:
> > 
> > First of all, I bisected between v2.6.37-rc2..f005fe12b90c which where
> > only a couple of patches and merged v2.6.38-rc4 in at every step. There
> > was no failure found.
> > Then I tried this again, but this time I merged v2.6.38-rc5 at every
> > step and was successful. The bad commit in this branch turned out to be
> > 
> > 1a4a678b12c84db9ae5dce424e0e97f0559bb57c
> > 
> > which is related to memblock.
> > 
> > Then I tried to find out which change between 2.6.38-rc4 and 2.6.38-rc5
> > is needed to trigger the failure, so I used f005fe12b90c as a base,
> > bisected between v2.6.38-rc4..v2.6.38-rc5 and merged every bisect step
> > into the base and tested. Here the bad commit turned out to be
> > 
> > e6d2e2b2b1e1455df16d68a78f4a3874c7b3ad20
> > 
> > which is related to gart. It turned out that the gart aperture on that
> > box is on another position with these patches. Before it was as
> > 0xa400 and now it is at 0xa000. It seems like this has something
> > to do with the root-cause.
> > 
> > Reverting commit 1a4a678b12c84db9ae5dce424e0e97f0559bb57c fixes the
> > problem btw. and booting with iommu=soft also works, but I have no idea
> > yet why the aperture at that address is a problem (with the patch
> > reverted the aperture lands at 0x8000).
> > 
> 
> Does reverting e6d2e2b2b1e1455df16d68a78f4a3874c7b3ad20 solve the
> problem for you?

No, reverting that patch doesn't make the problem go away (and the gart
aperture is still at 0xa000). I tested this on 39-rc3; I haven't
tested whether it makes a difference on the original bisect commit from Ingo.
It probably does (I don't know if that matters).
What is strange about this commit is that it fixes an x86 gart aperture
allocation bug in generic memblock code.

> 1a4a678b12c84db9ae5dce424e0e97f0559bb57c is a memory-allocation-order
> patch, which have a nasty tendency to unmask bugs elsewhere in the
> kernel.  However, e6d2e2b2b1e1455df16d68a78f4a3874c7b3ad20 looks
> positively strange (and it doesn't exactly help that the description is
> written in Yinghai-ese and is therefore nearly impossible to decode,
> never mind tell if it is remotely correct.)

I think that the two commits are okay and the bug is somewhere else, but
I have no idea yet where to look next. I spent some time looking at the
radeon code and talking to Alex about it (because it seemed suspicious
that the GTT is at 0xa000 too, but as Alex explained to me, this is an
address in the GPU address space and shouldn't matter).

Regards,

   Joerg


Re: Linux 2.6.39-rc3

2011-04-13 Thread Joerg Roedel

On Wed, Apr 13, 2011 at 11:39:29AM -0700, H. Peter Anvin wrote:
> On 04/13/2011 10:21 AM, Joerg Roedel wrote:
> > On Wed, Apr 13, 2011 at 08:46:09AM +0200, Ingo Molnar wrote:
> >> Could you please send the before/after bootlog (in particular all memory 
> >> init 
> >> messages included) and your .config?
> >>
> >>  before:  f005fe12b90c: x86-64: Move out cleanup higmap [_brk_end, _end) 
> >> out of init_memory_mapping()
> >>   after:  d2137d5af425: Merge branch 'linus' into x86/bootmem
> >>
> >> I've Cc:-ed more people who might have an idea about it.
> > 
> > Okay, I have done some more bisecting and debugging today.
> > 
> 
> First of all, *huge* thanks for this effort.  At least we need to track
> down the bits that need to be reverted -- it is past rc3, and it's time
> to see what we should revert and tell the submitter to try again next cycle.
> 
> This looks to be the same issue as in bugzilla 33012:
> 
>   https://bugzilla.kernel.org/show_bug.cgi?id=33012
> 
> ... so it would be good if we could keep the information in there.

Yes, I will try to find my korg bugzilla account again and put the
information from this email there.

Joerg


Re: Linux 2.6.39-rc3

2011-04-13 Thread Joerg Roedel

On Wed, Apr 13, 2011 at 12:14:55PM -0700, Yinghai Lu wrote:
> thanks for the bisecting...
> 
> so those two patches uncover some problems.
> 
> [0.00] Checking aperture...
> [0.00] No AGP bridge found
> [0.00] Node 0: aperture @ a000 size 32 MB
> [0.00] Aperture pointing to e820 RAM. Ignoring.
> [0.00] Your BIOS doesn't leave a aperture memory hole
> [0.00] Please enable the IOMMU option in the BIOS setup
> [0.00] This costs you 64 MB of RAM
> [0.00] memblock_x86_reserve_range: [0xa000-0xa3ff]   
> aperture64
> [0.00] Mapping aperture over 65536 KB of RAM @ a000
> 
> so kernel try to reallocate apperture. because BIOS allocated is pointed to 
> RAM or size is too small.

It is actually beyond 4GB on that machine, this value read here is from
the previous kernel-boot. The BIOS does not reset these values on a
reboot.

> but your radeon does use [0xa000, 0xbfff)

Yes, I suspected that too (and spent a few hours reading radeon code),
but then I talked to Alex Deucher and he explained that the addresses
the driver prints for GTT and VRAM are in the GPU address space
and do not refer to system RAM. So this shouldn't be the problem.

Joerg
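
The point about GPU addresses can be illustrated with a small sketch. It assumes 4 KiB pages and a flat, one-level translation table; struct gart_table and gart_translate() are hypothetical names for illustration, not radeon driver code. The idea is that an address inside the GTT range is only an index into a table of system pages, so its numeric value says nothing about which system RAM it touches.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE_BYTES 4096ULL   /* assumption: 4 KiB pages on both sides */

/* Number of whole pages needed to cover size_bytes. */
static uint64_t pages_for(uint64_t size_bytes)
{
	return (size_bytes + PAGE_SIZE_BYTES - 1) / PAGE_SIZE_BYTES;
}

/*
 * Hypothetical one-level translation table: slot i holds the system
 * address of the page that backs GPU page i of the GTT range.
 */
struct gart_table {
	uint64_t gpu_base;          /* start of the GTT range, GPU address space */
	uint64_t num_pages;
	const uint64_t *sys_pages;  /* system address for each GPU page */
};

/* Translate a GPU address inside the GTT range to a system address. */
static uint64_t gart_translate(const struct gart_table *t, uint64_t gpu_addr)
{
	assert(gpu_addr >= t->gpu_base);
	uint64_t off = gpu_addr - t->gpu_base;
	uint64_t idx = off / PAGE_SIZE_BYTES;
	assert(idx < t->num_pages);
	return t->sys_pages[idx] + off % PAGE_SIZE_BYTES;
}
```

Under the 4 KiB assumption, pages_for(512ULL << 20) is 131072, which agrees with the "GART: num cpu pages 131072, num gpu pages 131072" line printed for the 512M GTT in the dmesg quoted in this thread.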


Re: Linux 2.6.39-rc3

2011-04-13 Thread Alex Deucher

On Wed, Apr 13, 2011 at 3:14 PM, Yinghai Lu  wrote:
> On 04/13/2011 10:21 AM, Joerg Roedel wrote:
>> On Wed, Apr 13, 2011 at 08:46:09AM +0200, Ingo Molnar wrote:
>> First of all, I bisected between v2.6.37-rc2..f005fe12b90c which where
>> only a couple of patches and merged v2.6.38-rc4 in at every step. There
>> was no failure found.
>> Then I tried this again, but this time I merged v2.6.38-rc5 at every
>> step and was successful. The bad commit in this branch turned out to be
>>
>>       1a4a678b12c84db9ae5dce424e0e97f0559bb57c
>>
>> which is related to memblock.
>>
>> Then I tried to find out which change between 2.6.38-rc4 and 2.6.38-rc5
>> is needed to trigger the failure, so I used f005fe12b90c as a base,
>> bisected between v2.6.38-rc4..v2.6.38-rc5 and merged every bisect step
>> into the base and tested. Here the bad commit turned out to be
>>
>>       e6d2e2b2b1e1455df16d68a78f4a3874c7b3ad20
>>
>> which is related to gart. It turned out that the gart aperture on that
>> box is on another position with these patches. Before it was as
>> 0xa400 and now it is at 0xa000. It seems like this has something
>> to do with the root-cause.
>>
>> Reverting commit 1a4a678b12c84db9ae5dce424e0e97f0559bb57c fixes the
>> problem btw. and booting with iommu=soft also works, but I have no idea
>> yet why the aperture at that address is a problem (with the patch
>> reverted the aperture lands at 0x8000).
>>
>> I have put some debug-data online. There is my .config and two
>> dmesg-files for good (==2.6.39-rc3 + revert) and bad (==2.6.39-rc3)
>> I also created these dmesg-files again with memblock=debug, maybe that
>> helps to find the problem. The files are at
>>
>>       http://www.8bytes.org/~joro/debug/
>
> thanks for the bisecting...
>
> so those two patches uncover some problems.
>
> [    0.00] Checking aperture...
> [    0.00] No AGP bridge found
> [    0.00] Node 0: aperture @ a000 size 32 MB
> [    0.00] Aperture pointing to e820 RAM. Ignoring.
> [    0.00] Your BIOS doesn't leave a aperture memory hole
> [    0.00] Please enable the IOMMU option in the BIOS setup
> [    0.00] This costs you 64 MB of RAM
> [    0.00]     memblock_x86_reserve_range: [0xa000-0xa3ff]       
> aperture64
> [    0.00] Mapping aperture over 65536 KB of RAM @ a000
>
> so kernel try to reallocate apperture. because BIOS allocated is pointed to 
> RAM or size is too small.
>
> but your radeon does use [0xa000, 0xbfff)
>
> [    4.281993] radeon :01:05.0: VRAM: 320M 0xC000 - 
> 0xD3FF (320M used)
> [    4.290672] radeon :01:05.0: GTT: 512M 0xA000 - 
> 0xBFFF
> [    4.298550] [drm] Detected VRAM RAM=320M, BAR=256M
> [    4.309857] [drm] RAM width 32bits DDR
> [    4.313748] [TTM] Zone  kernel: Available graphics memory: 1896524 kiB.
> [    4.320379] [TTM] Initializing pool allocator.
> [    4.324948] [drm] radeon: 320M of VRAM memory ready
> [    4.329832] [drm] radeon: 512M of GTT memory ready.
>
> and the one seems working:
>
> [    0.00] Checking aperture...
> [    0.00] No AGP bridge found
> [    0.00] Node 0: aperture @ a000 size 32 MB
> [    0.00] Aperture pointing to e820 RAM. Ignoring.
> [    0.00] Your BIOS doesn't leave a aperture memory hole
> [    0.00] Please enable the IOMMU option in the BIOS setup
> [    0.00] This costs you 64 MB of RAM
> [    0.00]     memblock_x86_reserve_range: [0x8000-0x83ff]       
> aperture64
> [    0.00] Mapping aperture over 65536 KB of RAM @ 8000
> [    0.00]     memblock_x86_reserve_range: [0xacb6bdc0-0xacb6bddf]        
>   BOOTMEM
>
> will use different position...
>
> [    4.250159] radeon :01:05.0: VRAM: 320M 0xC000 - 
> 0xD3FF (320M used)
> [    4.258830] radeon :01:05.0: GTT: 512M 0xA000 - 
> 0xBFFF
> [    4.266742] [drm] Detected VRAM RAM=320M, BAR=256M
> [    4.271549] [drm] RAM width 32bits DDR
> [    4.275435] [TTM] Zone  kernel: Available graphics memory: 1896526 kiB.
> [    4.282066] [TTM] Initializing pool allocator.
> [    4.282085] usb 7-2: new full speed USB device number 2 using ohci_hcd
> [    4.293076] [drm] radeon: 320M of VRAM memory ready
> [    4.298277] [drm] radeon: 512M of GTT memory ready.
> [    4.303218] [drm] Supports vblank timestamp caching Rev 1 (10.10.2010).
> [    4.309854] [drm] Driver supports precise vblank timestamp query.
> [    4.315970] [drm] radeon: irq initialized.
> [    4.320094] [drm] GART: num cpu pages 131072, num gpu pages 131072
>
> So question is why radeon is using the address [0xa000 - 0xc00], and 
> in E820 it is RAM 

The VRAM and GTT addresses in the dmesg are internal GPU addresses, not
system addresses.  The GPU has its own internal address space for
on-chip memory clients (texture samplers, render buffers, display
controllers, etc.).  The GPU sets up two apertures in its internal
addres

Re: [PATCH] drm/radeon/kms: fix suspend on rv530 asics

2011-04-13 Thread Dave Airlie

On Thu, Apr 14, 2011 at 12:52 AM, Alex Deucher  wrote:
> On Wed, Apr 13, 2011 at 10:46 AM, Jerome Glisse  wrote:
>> On Tue, Apr 12, 2011 at 1:33 PM, Alex Deucher  wrote:
>>> Apparently only rv515 asics need the workaround
>>> added in f24d86f1a49505cdea56728b853a5d0a3f8e3d11
>>> (drm/radeon/kms: fix resume regression for some r5xx laptops).
>>>
>>> Fixes:
>>> https://bugs.freedesktop.org/show_bug.cgi?id=34709
>>>
>>> Signed-off-by: Alex Deucher 
>>> Cc: sta...@kernel.org
>>> ---
>>>  drivers/gpu/drm/radeon/atom.c |    6 +-
>>>  1 files changed, 5 insertions(+), 1 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/radeon/atom.c b/drivers/gpu/drm/radeon/atom.c
>>> index 258fa5e..d71d375 100644
>>> --- a/drivers/gpu/drm/radeon/atom.c
>>> +++ b/drivers/gpu/drm/radeon/atom.c
>>> @@ -32,6 +32,7 @@
>>>  #include "atom.h"
>>>  #include "atom-names.h"
>>>  #include "atom-bits.h"
>>> +#include "radeon.h"
>>>
>>>  #define ATOM_COND_ABOVE                0
>>>  #define ATOM_COND_ABOVEOREQUAL 1
>>> @@ -101,7 +102,9 @@ static void debug_print_spaces(int n)
>>>  static uint32_t atom_iio_execute(struct atom_context *ctx, int base,
>>>                                 uint32_t index, uint32_t data)
>>>  {
>>> +       struct radeon_device *rdev = ctx->card->dev->dev_private;
>>>        uint32_t temp = 0xCDCDCDCD;
>>> +
>>>        while (1)
>>>                switch (CU8(base)) {
>>>                case ATOM_IIO_NOP:
>>> @@ -112,7 +115,8 @@ static uint32_t atom_iio_execute(struct atom_context 
>>> *ctx, int base,
>>>                        base += 3;
>>>                        break;
>>>                case ATOM_IIO_WRITE:
>>> -                       (void)ctx->card->ioreg_read(ctx->card, CU16(base + 
>>> 1));
>>> +                       if (rdev->family == CHIP_RV515)
>>> +                               (void)ctx->card->ioreg_read(ctx->card, 
>>> CU16(base + 1));
>>>                        ctx->card->ioreg_write(ctx->card, CU16(base + 1), 
>>> temp);
>>>                        base += 3;
>>>                        break;
>>> --
>>> 1.7.1.1
>>>
>>
>>
>> So this patch enable io write only for one family ? This looks utterly 
>> strange.
>
> No, it just does a read before write for rv515.  I don't know why it
> needs it, but it seems to.
>

Yeah I really wish I knew why either,

Thinkpad T60 with X1300: no resume without this; it failed in the
memory initialisation table. This was the only thing I could find to
fix it.

My x1300 desktop card works fine without this.

Dave.
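
As a rough illustration of what the quoted patch changes, here is a stand-alone sketch of a register write that issues a dummy read first on one chip family only. The types and names (struct io_card, enum chip_family, iio_write, the counting callbacks) are hypothetical stand-ins for illustration and are not the driver's atom_context or radeon_device definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical family identifiers, for illustration only. */
enum chip_family { FAMILY_RV515, FAMILY_RV530, FAMILY_OTHER };

struct io_card {
	enum chip_family family;
	uint32_t (*ioreg_read)(struct io_card *card, uint32_t reg);
	void (*ioreg_write)(struct io_card *card, uint32_t reg, uint32_t val);
	unsigned int reads;    /* counters so the behaviour can be observed */
	unsigned int writes;
	uint32_t last_val;
};

/* Fake register read: only counts the access. */
static uint32_t counting_read(struct io_card *card, uint32_t reg)
{
	(void)reg;
	card->reads++;
	return 0;
}

/* Fake register write: counts the access and remembers the value. */
static void counting_write(struct io_card *card, uint32_t reg, uint32_t val)
{
	(void)reg;
	card->writes++;
	card->last_val = val;
}

/* Write val to reg; on the RV515 family only, read reg once beforehand. */
static void iio_write(struct io_card *card, uint32_t reg, uint32_t val)
{
	assert(card && card->ioreg_read && card->ioreg_write);
	if (card->family == FAMILY_RV515)
		(void)card->ioreg_read(card, reg);
	card->ioreg_write(card, reg, val);
}
```

The write itself happens for every family; only the preceding read is conditional, which is the distinction Alex draws in his reply quoted above.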

Re: Revert 737a3bb9416ce2a7c7a4170852473a4fcc9c67e8 ?

2011-04-13 Thread Andy Furniss


Michel Dänzer wrote:


That does sound like the GPU locks up. Do you get any messages in dmesg
about lockups and attempts to reset the GPU at any time?


No.


Hmm, I guess the constant SIGALRMs might prevent the lockup detection
from kicking in... Maybe you can try starting the X server with
-dumbSched to see if that gets things along any further, but in the end
there's probably no way around figuring out what causes the lockup and
fixing that anyway.


I have an old AGP box that locks with 600g + agpgart - It used to give 
GPU lockup to dmesg/log, but (I only test it occasionally) it doesn't 
anymore. I can still sysrq OK.


I wonder if something changed in recent months in the drm/whatever code 
that has changed/blocked the logging.


[Bug 36221] New: KMS with X1950 XT i2c error --> no ddc

2011-04-13 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=36221

   Summary: KMS with X1950 XT i2c error --> no ddc
   Product: DRI
   Version: unspecified
  Platform: All
OS/Version: All
Status: NEW
  Severity: critical
  Priority: medium
 Component: DRM/Radeon
AssignedTo: dri-devel@lists.freedesktop.org
ReportedBy: revea...@freakmail.de


Hello!

This is the operating system and kernel:

cat /etc/SuSE-release
openSUSE 11.4 (i586)
VERSION = 11.4
CODENAME = Celadon

uname -rio
2.6.37.1-1.2-desktop i386 GNU/Linux


When trying to boot with kernel modesetting there is no DDC due to an i2c
error, resulting in a blank screen.

I was told in the #radeon IRC channel to open a bug report and attach vbios.rom
and the dmesg of a boot with KMS enabled.

I hope you can help me out!

Many thanks for all your Help!

Greetings,

R


[Bug 36221] KMS with X1950 XT i2c error --> no ddc

2011-04-13 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=36221

--- Comment #1 from revealed  2011-04-13 13:41:58 PDT ---
Created an attachment (id=45589)
 --> (https://bugs.freedesktop.org/attachment.cgi?id=45589)
vbios.rom


[Bug 36221] KMS with X1950 XT i2c error --> no ddc

2011-04-13 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=36221

--- Comment #2 from revealed  2011-04-13 13:43:06 PDT ---
Created an attachment (id=45590)
 --> (https://bugs.freedesktop.org/attachment.cgi?id=45590)
Full dmesg containing the i2c error


Re: Linux 2.6.39-rc3

2011-04-13 Thread Yinghai Lu

On 04/13/2011 12:34 PM, Joerg Roedel wrote:
> On Wed, Apr 13, 2011 at 12:14:55PM -0700, Yinghai Lu wrote:
>> thanks for the bisecting...
>>
>> so those two patches uncover some problems.
>>
>> [0.00] Checking aperture...
>> [0.00] No AGP bridge found
>> [0.00] Node 0: aperture @ a000 size 32 MB
>> [0.00] Aperture pointing to e820 RAM. Ignoring.
>> [0.00] Your BIOS doesn't leave a aperture memory hole
>> [0.00] Please enable the IOMMU option in the BIOS setup
>> [0.00] This costs you 64 MB of RAM
>> [0.00] memblock_x86_reserve_range: [0xa000-0xa3ff]   
>> aperture64
>> [0.00] Mapping aperture over 65536 KB of RAM @ a000
>>
>> so kernel try to reallocate apperture. because BIOS allocated is pointed to 
>> RAM or size is too small.
> 
> It is actually beyond 4GB on that machine, this value read here is from
> the previous kernel-boot. The BIOS does not reset these values on a
> reboot.
> 
>> but your radeon does use [0xa000, 0xbfff)
> 
> Yes, I suspected that too (and spent a few hours reading radeon code),
> but then I talked the Alex Deucher and he explained that these addresses
> which the driver prints for GTT and VRAM are in the GPU address space
> and do not refer to system ram. So this shouldn't be the problem.


Can you try the following change? It will push the gart to 0x8000.

diff --git a/arch/x86/kernel/aperture_64.c b/arch/x86/kernel/aperture_64.c
index 86d1ad4..3b6a9d5 100644
--- a/arch/x86/kernel/aperture_64.c
+++ b/arch/x86/kernel/aperture_64.c
@@ -83,7 +83,7 @@ static u32 __init allocate_aperture(void)
 	 * so don't use 512M below as gart iommu, leave the space for kernel
 	 * code for safe
 	 */
-	addr = memblock_find_in_range(0, 1ULL<<32, aper_size, 512ULL<<20);
+	addr = memblock_find_in_range(0, 1ULL<<32, aper_size, 512ULL<<21);
 	if (addr == MEMBLOCK_ERROR || addr + aper_size > 0xffffffff) {
 		printk(KERN_ERR
 			"Cannot allocate aperture memory hole (%lx,%uK)\n",

Re: Linux 2.6.39-rc3

2011-04-13 Thread Linus Torvalds

On Wed, Apr 13, 2011 at 1:48 PM, Yinghai Lu  wrote:
>
> can you try following change ? it will push gart to 0x8000
>
> diff --git a/arch/x86/kernel/aperture_64.c b/arch/x86/kernel/aperture_64.c
> index 86d1ad4..3b6a9d5 100644
> --- a/arch/x86/kernel/aperture_64.c
> +++ b/arch/x86/kernel/aperture_64.c
> @@ -83,7 +83,7 @@ static u32 __init allocate_aperture(void)
>         * so don't use 512M below as gart iommu, leave the space for kernel
>         * code for safe
>         */
> -       addr = memblock_find_in_range(0, 1ULL<<32, aper_size, 512ULL<<20);
> +       addr = memblock_find_in_range(0, 1ULL<<32, aper_size, 512ULL<<21);

What are all the magic numbers, and why would 0x80000000 be special?

Why don't we write code that just works?

Or absent a "just works" set of patches, why don't we revert to code
that has years of testing?

This kind of "I broke things, so now I will jiggle things randomly
until they unbreak" is not acceptable.

Either explain why that fixes a real BUG (and why the magic constants
need to be what they are), or just revert the patch that caused the
problem, and go back to the allocation patterns that have years of
experience.

Guys, we've had this discussion before, in PCI allocation. We don't do
this. We tried switching the PCI region allocations to top-down, and
IT WAS A FAILURE. We reverted it to what we had years of testing with.

Don't just make random changes. There really are only two acceptable
models of development: "think and analyze" or "years and years of
testing on thousands of machines". Those two really do work.

   Linus

Re: Linux 2.6.39-rc3

2011-04-13 Thread Yinghai Lu

On 04/13/2011 01:54 PM, Linus Torvalds wrote:
> On Wed, Apr 13, 2011 at 1:48 PM, Yinghai Lu  wrote:
>>
>> can you try following change ? it will push gart to 0x8000
>>
>> diff --git a/arch/x86/kernel/aperture_64.c b/arch/x86/kernel/aperture_64.c
>> index 86d1ad4..3b6a9d5 100644
>> --- a/arch/x86/kernel/aperture_64.c
>> +++ b/arch/x86/kernel/aperture_64.c
>> @@ -83,7 +83,7 @@ static u32 __init allocate_aperture(void)
>> * so don't use 512M below as gart iommu, leave the space for kernel
>> * code for safe
>> */
>> -   addr = memblock_find_in_range(0, 1ULL<<32, aper_size, 512ULL<<20);
>> +   addr = memblock_find_in_range(0, 1ULL<<32, aper_size, 512ULL<<21);
> 
> What are all the magic numbers, and why would 0x8000 be special?

That is the old value from when the kernel was doing bottom-up bootmem allocation.

> 
> Why don't we write code that just works?
> 
> Or absent a "just works" set of patches, why don't we revert to code
> that has years of testing?
> 
> This kind of "I broke things, so now I will jiggle things randomly
> until they unbreak" is not acceptable.
> 
> Either explain why that fixes a real BUG (and why the magic constants
> need to be what they are), or just revert the patch that caused the
> problem, and go back to the allocation patters that have years of
> experience.
> 
> Guys, we've had this discussion before, in PCI allocation. We don't do
> this. We tried switching the PCI region allocations to top-down, and
> IT WAS A FAILURE. We reverted it to what we had years of testing with.
> 
> Don't just make random changes. There really are only two acceptable
> models of development: "think and analyze" or "years and years of
> testing on thousands of machines". Those two really do work.

We did do the analysis, and the only difference seems to be:
the good one is using 0x80000000
and the bad one is using 0xa0000000.

We are trying to figure out whether it needs a low address and only happened
to work because the kernel was doing bottom-up allocation.

Thanks

Yinghai
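
As a rough illustration of how the search direction and the alignment together decide where the aperture lands, here is a toy model in C. It is not the kernel's memblock code: toy_find is a hypothetical helper that knows about one free region only and no other reservations, and the region bounds used in the note below (0x100000 to 0xacb8d000) are an assumption taken from the first usable e820 entry posted in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_NOT_FOUND UINT64_MAX

/* True if x is a power of two (alignments here always are). */
static inline bool toy_is_pow2(uint64_t x)
{
    return x != 0 && (x & (x - 1)) == 0;
}

/*
 * Toy stand-in for a range allocator: a single free region [lo, hi)
 * with no existing reservations. Returns the lowest aligned fit when
 * topdown is false and the highest aligned fit when it is true.
 */
static inline uint64_t toy_find(uint64_t lo, uint64_t hi, uint64_t size,
                                uint64_t align, bool topdown)
{
    assert(toy_is_pow2(align));
    if (size == 0 || hi < lo || hi - lo < size)
        return TOY_NOT_FOUND;

    uint64_t first = (lo + align - 1) & ~(align - 1); /* round lo up */
    uint64_t last = (hi - size) & ~(align - 1);       /* round down */

    if (first > hi - size || last < lo)
        return TOY_NOT_FOUND;
    return topdown ? last : first;
}
```

With a 64 MB size in that assumed region, the top-down search returns 0xa0000000 for 512M alignment and 0x80000000 for 1G alignment, which matches the two addresses being compared. The bottom-up search in the same toy returns 0x20000000 and 0x40000000, which is not what the old kernel produced on the real machine, because the toy ignores every other early reservation.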

Re: Linux 2.6.39-rc3

2011-04-13 Thread Joerg Roedel

On Wed, Apr 13, 2011 at 01:48:48PM -0700, Yinghai Lu wrote:
> - addr = memblock_find_in_range(0, 1ULL<<32, aper_size, 512ULL<<20);
> + addr = memblock_find_in_range(0, 1ULL<<32, aper_size, 512ULL<<21);

Btw, while looking at this code I wondered why the 512M goal is enforced
by the alignment. Start could be set to 512M instead and the alignment
can be aper_size as it should. Any reason for such a big alignment?

Joerg

P.S.: The box is still in the office, I will try this debug-patch
  tomorrow.
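
Joerg's question about start versus alignment can be made concrete with a small sketch in C. The helper names are hypothetical and this is not kernel code; it only computes the lowest address a search could return for a given start and power-of-two alignment.

```c
#include <assert.h>
#include <stdint.h>

/* Round addr up to a power-of-two alignment. */
static inline uint64_t toy_round_up(uint64_t addr, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}

/* Lowest address a search beginning at `start` is allowed to return. */
static inline uint64_t toy_lowest_candidate(uint64_t start, uint64_t align)
{
    return toy_round_up(start, align);
}
```

With start 0 and 512M alignment, address 0 is still a valid candidate, so the alignment by itself does not say "stay at or above 512M"; it merely leaves no other aligned address below 512M. With start 512M and an alignment of 64M (the aperture size), the floor is stated directly and the alignment is no larger than the hardware needs.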


Re: Linux 2.6.39-rc3

2011-04-13 Thread Yinghai Lu

On 04/13/2011 02:50 PM, Joerg Roedel wrote:
> On Wed, Apr 13, 2011 at 01:48:48PM -0700, Yinghai Lu wrote:
>> -addr = memblock_find_in_range(0, 1ULL<<32, aper_size, 512ULL<<20);
>> +addr = memblock_find_in_range(0, 1ULL<<32, aper_size, 512ULL<<21);
> 
> Btw, while looking at this code I wondered why the 512M goal is enforced
> by the alignment. Start could be set to 512M instead and the alignment
> can be aper_size as it should. Any reason for such a big alignment?
> 

When using bootmem, we tried to use a big alignment (512M) so that we could
avoid taking the RAM range below 512M.

commit 7677b2ef6c0c4fddc84f6473f3863f40eb71821b
Author: Yinghai Lu 
Date:   Mon Apr 14 20:40:37 2008 -0700

x86_64: allocate gart aperture from 512M

because we try to reserve dma32 early, so we have chance to get aperture
from 64M.

with some sequence aperture allocated from RAM, could become E820_RESERVED.

and then if doing a kexec with a big kernel that uncompressed size is above
64M we could have a range conflict with still using gart.

So allocate gart aperture from 512M instead.

Also change the fallback_aper_order to 5, because we don't have chance to get
2G or 4G aperture.

We can change it back to 32M or make it equal to size.

> 
> P.S.: The box is still in the office, I will try this debug-patch
>   tomorrow.

Alexandre's system is working at 0xa4000000 with 2.6.38.2.

So it is not a low-address problem. It could be another reason, such as
some other code needing the lower address.

Thanks

Yinghai

Re: Linux 2.6.39-rc3

2011-04-13 Thread H. Peter Anvin

On 04/13/2011 02:50 PM, Joerg Roedel wrote:
> On Wed, Apr 13, 2011 at 01:48:48PM -0700, Yinghai Lu wrote:
>> -addr = memblock_find_in_range(0, 1ULL<<32, aper_size, 512ULL<<20);
>> +addr = memblock_find_in_range(0, 1ULL<<32, aper_size, 512ULL<<21);
> 
> Btw, while looking at this code I wondered why the 512M goal is enforced
> by the alignment. Start could be set to 512M instead and the alignment
> can be aper_size as it should. Any reason for such a big alignment?
> 
>   Joerg
> 
> P.S.: The box is still in the office, I will try this debug-patch
>   tomorrow.

The only reason that I can think of is that the aperture itself can be
huge, and perhaps 512 MiB is the biggest such known.  512ULL<<21 is of
course a particularly moronic way to write 1 GiB, but it was a debug patch.

The value 512 MiB apparently comes from
7677b2ef6c0c4fddc84f6473f3863f40eb71821b, which is apparently totally ad
hoc; effectively it tries to prevent a collision with kexec by
hardcoding the kdump allocation as it sat at that point in time in the
GART assignment rules.

Yeah.  Brilliant.

-hpa
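
The arithmetic behind the remark about 512ULL<<21 can be checked directly. The sketch below is plain C with hypothetical helper names; the only things taken from the thread are the two shift expressions in the debug patch.

```c
#include <assert.h>
#include <stdint.h>

/* Alignment argument before the debug patch: 512 << 20 bytes. */
static inline uint64_t toy_align_before(void) { return 512ULL << 20; }

/* Alignment argument after the debug patch: 512 << 21 bytes. */
static inline uint64_t toy_align_after(void) { return 512ULL << 21; }

/* Express a byte count in whole MiB. */
static inline uint64_t toy_to_mib(uint64_t bytes) { return bytes >> 20; }

/* Compile-time check (C11): 512 << 21 is exactly 1 GiB. */
static_assert((512ULL << 21) == (1ULL << 30), "512ULL<<21 is 1 GiB");
```

So the patch doubles the alignment from 512 MiB (0x20000000) to 1 GiB (0x40000000); 1ULL<<30 would have said the same thing more plainly.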


Re: Linux 2.6.39-rc3

2011-04-13 Thread H. Peter Anvin

On 04/13/2011 02:59 PM, Yinghai Lu wrote:
> On 04/13/2011 02:50 PM, Joerg Roedel wrote:
>> On Wed, Apr 13, 2011 at 01:48:48PM -0700, Yinghai Lu wrote:
>>> -   addr = memblock_find_in_range(0, 1ULL<<32, aper_size, 512ULL<<20);
>>> +   addr = memblock_find_in_range(0, 1ULL<<32, aper_size, 512ULL<<21);
>>
>> Btw, while looking at this code I wondered why the 512M goal is enforced
>> by the alignment. Start could be set to 512M instead and the alignment
>> can be aper_size as it should. Any reason for such a big alignment?
>>
> 
> when using bootmem, try to use big alignment (512M ), so we could avoid take 
> ram range below 512M.
> 

Yes, his question was why on Earth are you using 0 as start if that is
the purpose.

On top of that, where the hell does the magic 512 MiB come from?  It
looks like it is either completely ad hoc, or it has something to do with
where the kexec kernel was allocated once upon a time.

-hpa

Re: Linux 2.6.39-rc3

2011-04-13 Thread Joerg Roedel

On Wed, Apr 13, 2011 at 03:01:10PM -0700, H. Peter Anvin wrote:
> On 04/13/2011 02:50 PM, Joerg Roedel wrote:
> > On Wed, Apr 13, 2011 at 01:48:48PM -0700, Yinghai Lu wrote:
> >> -  addr = memblock_find_in_range(0, 1ULL<<32, aper_size, 512ULL<<20);
> >> +  addr = memblock_find_in_range(0, 1ULL<<32, aper_size, 512ULL<<21);
> > 
> > Btw, while looking at this code I wondered why the 512M goal is enforced
> > by the alignment. Start could be set to 512M instead and the alignment
> > can be aper_size as it should. Any reason for such a big alignment?
> > 
> > Joerg
> > 
> > P.S.: The box is still in the office, I will try this debug-patch
> >   tomorrow.
> 
> The only reason that I can think of is that the aperture itself can be
> huge, and perhaps 512 MiB is the biggest such known. 

Well, that would work as well by just using aper_size as alignment, the
aperture needs to be aligned on its size anyway. This code only runs
when Linux allocates the aperture itself and, if I am not mistaken, it
always uses 64MB when doing this.

Joerg


Re: Linux 2.6.39-rc3

2011-04-13 Thread H. Peter Anvin

On 04/13/2011 03:22 PM, Joerg Roedel wrote:
> On Wed, Apr 13, 2011 at 03:01:10PM -0700, H. Peter Anvin wrote:
>> On 04/13/2011 02:50 PM, Joerg Roedel wrote:
>>> On Wed, Apr 13, 2011 at 01:48:48PM -0700, Yinghai Lu wrote:
>>>> -  addr = memblock_find_in_range(0, 1ULL<<32, aper_size, 512ULL<<20);
>>>> +  addr = memblock_find_in_range(0, 1ULL<<32, aper_size, 512ULL<<21);
>>>
>>> Btw, while looking at this code I wondered why the 512M goal is enforced
>>> by the alignment. Start could be set to 512M instead and the alignment
>>> can be aper_size as it should. Any reason for such a big alignment?
>>>
>>> Joerg
>>>
>>> P.S.: The box is still in the office, I will try this debug-patch
>>>   tomorrow.
>>
>> The only reason that I can think of is that the aperture itself can be
>> huge, and perhaps 512 MiB is the biggest such known. 
> 
> Well, that would work as well by just using aper_size as alignment, the
> aperture needs to be aligned on its size anyway. This code only runs
> when Linux allocates the aperture itself and if I am mistaken is uses
> always 64MB when doing this.

Yes, I would agree with that.  The sane thing would be to set the base
to whatever address needs to be guarded against (WHICH SHOULD BE
MOTIVATED), and use aper_size as alignment, *unless* we are only using
the initial portion of a much larger hardware structure that needs
natural alignment (which isn't clear to me, I do know we sometimes use
only a fraction of the GART, but that doesn't mean we need to
naturally-align the entire thing, nor that 512 MiB is sufficient to do so.)

-hpa



Re: Linux 2.6.39-rc3

2011-04-13 Thread Linus Torvalds

On Wed, Apr 13, 2011 at 2:23 PM, Yinghai Lu  wrote:
>>
>> What are all the magic numbers, and why would 0x8000 be special?
>
> that is the old value when kernel was doing bottom-up bootmem allocation.

I understand, BUT THAT IS STILL A TOTALLY MAGIC NUMBER!

It makes it come out the same ON THAT ONE MACHINE.  So no, it's not
"the old value". It's a random value that gets the old value in one
specific case.

>> Why don't we write code that just works?
>>
>> Or absent a "just works" set of patches, why don't we revert to code
>> that has years of testing?
>>
>> This kind of "I broke things, so now I will jiggle things randomly
>> until they unbreak" is not acceptable.
>>
>> Either explain why that fixes a real BUG (and why the magic constants
>> need to be what they are), or just revert the patch that caused the
>> problem, and go back to the allocation patters that have years of
>> experience.
>>
>> Guys, we've had this discussion before, in PCI allocation. We don't do
>> this. We tried switching the PCI region allocations to top-down, and
>> IT WAS A FAILURE. We reverted it to what we had years of testing with.
>>
>> Don't just make random changes. There really are only two acceptable
>> models of development: "think and analyze" or "years and years of
>> testing on thousands of machines". Those two really do work.
>
> We did do the analyzing, and only difference seems to be:

No.

Yinghai, we have had this discussion before, and dammit, you need to
understand the difference between "understanding the problem" and "put
in random values until it works on one machine".

There was absolutely _zero_ analysis done. You do not actually
understand WHY the numbers matter. You just look at two random
numbers, and one works, the other does not. That's not "analyzing".
That's just "random number games".

If you cannot see and understand the difference between an actual
analytical solution where you _understand_ what the code is doing and
why, and "random numbers that happen to work on one machine", I don't
know what to tell you.

> good one is using 0x8000
> and bad one is using 0xa000.
>
> We try to figure out if it needs low address and it happen to work
> because kernel was doing bottom up allocation.

No.

Let me repeat my point one more time.

You have TWO choices. Not more, not less:

 - choice #1: go back to the old allocation model. It's tested. It
doesn't regress. Admittedly we may not know exactly _why_ it works,
and it might not work on all machines, but it doesn't cause
regressions (ie the machines it doesn't work on it _never_ worked on).

   And this doesn't mean "old value for that _one_ machine". It means
"old value for _every_ machine". So it means we revert the whole
top-down thing entirely. Not just "change one random number so that
the totally different allocation pattern happens to give the same
result on one particular machine".

   Quite frankly, I don't see the point of doing top-to-bottom anyway,
so I think we should do this regardless. Just revert the whole
"allocate from top". It didn't work for PCI, it's not working for this
case either. Stop doing it.

 - Choice #2: understand exactly _what_ goes wrong, and fix it
analytically (ie by _understanding_ the problem, and being able to
solve it exactly, and in a way you can argue about without having to
resort to "magic happens").

Now, the whole analytic approach (aka "computer sciency" approach),
where you can actually think about the problem without having any
pesky "reality" impact the solution is obviously the one we tend to
prefer. Sadly, it's seldom the one we can use in reality when it comes
to things like resource allocation, since we end up starting off with
often buggy approximations of what the actual hardware is all about
(ie broken firmware tables).

So I'd love to know exactly why one random number works, and why
another one doesn't. But as long as we do _not_ know the "Why" of it,
we will have to revert.

It really is that simple. It's _always_ that simple.

So the numbers shouldn't be "magic", they should have real
explanations. And in the absence of real explanation, the model that
works is "this is what we've always done". Including, very much, the
whole allocation order. Not just one random number on one random
machine.

Linus

Re: Linux 2.6.39-rc3

2011-04-13 Thread Yinghai Lu

On 04/13/2011 04:39 PM, Linus Torvalds wrote:
> On Wed, Apr 13, 2011 at 2:23 PM, Yinghai Lu  wrote:
>>>
>>> What are all the magic numbers, and why would 0x8000 be special?
>>
>> that is the old value when kernel was doing bottom-up bootmem allocation.
> 
> I understand, BUT THAT IS STILL A TOTALLY MAGIC NUMBER!
> 
> It makes it come out the same ON THAT ONE MACHINE.  So no, it's not
> "the old value". It's a random value that gets the old value in one
> specific case.

Alexandre's system works with 2.6.38.2, where the kernel allocates from 0xa4000000.
Joerg's system works with 2.6.39-rc3 when the top-down bootmem patch
1a4a678b12c84db9ae5dce424e0e97f0559bb57c is reverted,
and the kernel then allocates at 0x80000000.
Alexandre's system works when the alignment is increased to 1G, which makes the
kernel allocate 0x80000000 for the GART.

They do not work if the kernel allocates from 0xa0000000.

The 0xa0000000 looks like the same value as the radeon GTT.


[    4.250159] radeon 0000:01:05.0: VRAM: 320M 0xC0000000 - 0xD3FFFFFF (320M used)
[    4.258830] radeon 0000:01:05.0: GTT: 512M 0xA0000000 - 0xBFFFFFFF
[    4.266742] [drm] Detected VRAM RAM=320M, BAR=256M
[    4.271549] [drm] RAM width 32bits DDR
[    4.275435] [TTM] Zone  kernel: Available graphics memory: 1896526 kiB.
[    4.282066] [TTM] Initializing pool allocator.
[    4.282085] usb 7-2: new full speed USB device number 2 using ohci_hcd
[    4.293076] [drm] radeon: 320M of VRAM memory ready
[    4.298277] [drm] radeon: 512M of GTT memory ready.
[    4.303218] [drm] Supports vblank timestamp caching Rev 1 (10.10.2010).
[    4.309854] [drm] Driver supports precise vblank timestamp query.
[    4.315970] [drm] radeon: irq initialized.
[    4.320094] [drm] GART: num cpu pages 131072, num gpu pages 131072

Alex said that 0xa0000000 is OK and is from the GPU address space:
---
The VRAM and GTT addresses in the dmesg are internal GPU addresses not
system addresses.  The GPU has its own internal address space for
on-chip memory clients (texture samplers, render buffers, display
controllers, etc.).  The GPU sets up two apertures in its internal
address space and on-chip client requests are forwarded to the
appropriate place by the GPU's memory controller.  Addresses in the
GPU's VRAM aperture go to local vram on discrete cards, or to the
stolen memory at the top of system memory for IGP cards.  Addresses in
the GPU's GTT aperture hit a page table and get forwarded to the
appropriate dma pages.
---
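
To make the last sentence of that explanation concrete, here is a minimal sketch in C of an aperture lookup through a page table. It is not the radeon driver's code: the struct, the function name, and the 4 KiB page size are all assumptions chosen for illustration. The only point is that an address such as 0xA0000000 is an index into the GPU's table, and the system address it ends up at is whatever the table entry says.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12                      /* assumed 4 KiB pages */
#define TOY_PAGE_SIZE  (1ULL << TOY_PAGE_SHIFT)
#define TOY_BAD_ADDR   UINT64_MAX

/* Hypothetical description of a GTT aperture. */
struct toy_gtt {
    uint64_t gpu_base;         /* aperture start in GPU address space */
    size_t num_pages;          /* number of page table entries */
    const uint64_t *dma_pages; /* system (DMA) address of each page */
};

/* Translate a GPU address inside the aperture to a system address. */
static inline uint64_t toy_gtt_translate(const struct toy_gtt *gtt,
                                         uint64_t gpu_addr)
{
    assert(gtt != NULL && gtt->dma_pages != NULL);
    if (gpu_addr < gtt->gpu_base)
        return TOY_BAD_ADDR;

    uint64_t off = gpu_addr - gtt->gpu_base;
    uint64_t idx = off >> TOY_PAGE_SHIFT;
    if (idx >= gtt->num_pages)
        return TOY_BAD_ADDR;

    /* Page comes from the table, byte offset within the page is kept. */
    return gtt->dma_pages[idx] + (off & (TOY_PAGE_SIZE - 1));
}
```

In this model the GPU-side base and the system addresses in the table are unrelated numbers, which is why a GTT range printed as 0xA0000000 need not collide with system RAM at 0xa0000000.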

> 
>>> Why don't we write code that just works?
>>>
>>> Or absent a "just works" set of patches, why don't we revert to code
>>> that has years of testing?
>>>
>>> This kind of "I broke things, so now I will jiggle things randomly
>>> until they unbreak" is not acceptable.
>>>
>>> Either explain why that fixes a real BUG (and why the magic constants
>>> need to be what they are), or just revert the patch that caused the
>>> problem, and go back to the allocation patters that have years of
>>> experience.
>>>
>>> Guys, we've had this discussion before, in PCI allocation. We don't do
>>> this. We tried switching the PCI region allocations to top-down, and
>>> IT WAS A FAILURE. We reverted it to what we had years of testing with.
>>>
>>> Don't just make random changes. There really are only two acceptable
>>> models of development: "think and analyze" or "years and years of
>>> testing on thousands of machines". Those two really do work.
>>
>> We did do the analyzing, and only difference seems to be:
> 
> No.
> 
> Yinghai, we have had this discussion before, and dammit, you need to
> understand the difference between "understanding the problem" and "put
> in random values until it works on one machine".
> 
> There was absolutely _zero_ analysis done. You do not actually
> understand WHY the numbers matter. You just look at two random
> numbers, and one works, the other does not. That's not "analyzing".
> That's just "random number games".
> 
> If you cannot see and understand the difference between an actual
> analytical solution where you _understand_ what the code is doing and
> why, and "random numbers that happen to work on one machine", I don't
> know what to tell you.
> 
>> good one is using 0x8000
>> and bad one is using 0xa000.
>>
>> We try to figure out if it needs low address and it happen to work
>> because kernel was doing bottom up allocation.
> 
> No.
> 
> Let me repeat my point one more time.
> 
> You have TWO choices. Not more, not less:
> 
>  - choice #1: go back to the old allocation model. It's tested. It
> doesn't regress. Admittedly we may not know exactly _why_ it works,
> and it might not work on all machines, but it doesn't cause
> regressions (ie the machines it doesn't work on it _never_ worked on).
> 
>And this doesn't mean "old value for that _one_ machine". It means
> "old value for _every_ machine". So it means we revert the whole
> bottom-down thing entirely. Not just "change one random number so that
> the totally different

Re: Linux 2.6.39-rc3

2011-04-13 Thread H. Peter Anvin

On 04/13/2011 12:14 PM, Yinghai Lu wrote:
> 
> so those two patches uncover some problems.
> 
> [    0.000000] Checking aperture...
> [    0.000000] No AGP bridge found
> [    0.000000] Node 0: aperture @ a0000000 size 32 MB
> [    0.000000] Aperture pointing to e820 RAM. Ignoring.
> [    0.000000] Your BIOS doesn't leave a aperture memory hole
> [    0.000000] Please enable the IOMMU option in the BIOS setup
> [    0.000000] This costs you 64 MB of RAM
> [    0.000000] memblock_x86_reserve_range: [0xa0000000-0xa3ffffff] aperture64
> [    0.000000] Mapping aperture over 65536 KB of RAM @ a0000000
> 
> so the kernel tries to reallocate the aperture, because the one the BIOS
> allocated points to RAM or its size is too small.
> 
> but your radeon does use [0xa0000000, 0xbfffffff)
> 
> [    4.281993] radeon 0000:01:05.0: VRAM: 320M 0xC0000000 - 0xD3FFFFFF (320M used)
> [    4.290672] radeon 0000:01:05.0: GTT: 512M 0xA0000000 - 0xBFFFFFFF
> [    4.298550] [drm] Detected VRAM RAM=320M, BAR=256M
> [    4.309857] [drm] RAM width 32bits DDR
> [    4.313748] [TTM] Zone  kernel: Available graphics memory: 1896524 kiB.
> [    4.320379] [TTM] Initializing pool allocator.
> [    4.324948] [drm] radeon: 320M of VRAM memory ready
> [    4.329832] [drm] radeon: 512M of GTT memory ready.
> 
> and the one that seems to be working:
> 
> [    0.000000] Checking aperture...
> [    0.000000] No AGP bridge found
> [    0.000000] Node 0: aperture @ a0000000 size 32 MB
> [    0.000000] Aperture pointing to e820 RAM. Ignoring.
> [    0.000000] Your BIOS doesn't leave a aperture memory hole
> [    0.000000] Please enable the IOMMU option in the BIOS setup
> [    0.000000] This costs you 64 MB of RAM
> [    0.000000] memblock_x86_reserve_range: [0x80000000-0x83ffffff] aperture64
> [    0.000000] Mapping aperture over 65536 KB of RAM @ 80000000
> [    0.000000] memblock_x86_reserve_range: [0xacb6bdc0-0xacb6bddf] BOOTMEM
> 
> will use different position...
> 
> [    4.250159] radeon 0000:01:05.0: VRAM: 320M 0xC0000000 - 0xD3FFFFFF (320M used)
> [    4.258830] radeon 0000:01:05.0: GTT: 512M 0xA0000000 - 0xBFFFFFFF
> [    4.266742] [drm] Detected VRAM RAM=320M, BAR=256M
> [    4.271549] [drm] RAM width 32bits DDR
> [    4.275435] [TTM] Zone  kernel: Available graphics memory: 1896526 kiB.
> [    4.282066] [TTM] Initializing pool allocator.
> [    4.282085] usb 7-2: new full speed USB device number 2 using ohci_hcd
> [    4.293076] [drm] radeon: 320M of VRAM memory ready
> [    4.298277] [drm] radeon: 512M of GTT memory ready.
> [    4.303218] [drm] Supports vblank timestamp caching Rev 1 (10.10.2010).
> [    4.309854] [drm] Driver supports precise vblank timestamp query.
> [    4.315970] [drm] radeon: irq initialized.
> [    4.320094] [drm] GART: num cpu pages 131072, num gpu pages 131072
> 
> So the question is why radeon is using the address range
> [0xa0000000 - 0xc0000000], when in E820 it is RAM
> 
> [    0.000000]  BIOS-e820: 0000000000100000 - 00000000acb8d000 (usable)
> [    0.000000]  BIOS-e820: 00000000acb8d000 - 00000000acb8f000 (reserved)
> [    0.000000]  BIOS-e820: 00000000acb8f000 - 00000000afce9000 (usable)
> [    0.000000]  BIOS-e820: 00000000afce9000 - 00000000afd21000 (reserved)
> [    0.000000]  BIOS-e820: 00000000afd21000 - 00000000afd4f000 (usable)
> [    0.000000]  BIOS-e820: 00000000afd4f000 - 00000000afdcf000 (reserved)
> [    0.000000]  BIOS-e820: 00000000afdcf000 - 00000000afecf000 (ACPI NVS)
> [    0.000000]  BIOS-e820: 00000000afecf000 - 00000000afeff000 (ACPI data)
> [    0.000000]  BIOS-e820: 00000000afeff000 - 00000000aff00000 (usable)
> 
> so it looks like the BIOS programs a wrong address into the radeon card?
> 

Okay, staring at this, it definitely seems toxic to overlay the GART
over memory areas reserved by the BIOS.  If I were to guess, I would say
that the problem here seems to be that the kernel thinks it is
overlaying 64 MiB of memory, but the actual GART is in fact 512 MiB in
size -- 131072 CPU pages -- which now overlaps the BIOS reserved areas.

Alex D., could you comment on the "num cpu pages" bit?

-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.
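
The guess above rests on a simple interval check, sketched here in C. The function name is hypothetical, the ranges are half-open, and the reserved range used in the note (0xacb8d000 to 0xacb8f000) is an assumption read off the e820 listing quoted in this message. The sketch shows only the arithmetic of the guess; whether the 0xA0000000 figures are system addresses at all is a separate question.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if half-open ranges [a_start, a_end) and [b_start, b_end) overlap. */
static inline bool toy_overlaps(uint64_t a_start, uint64_t a_end,
                                uint64_t b_start, uint64_t b_end)
{
    assert(a_start <= a_end && b_start <= b_end);
    return a_start < b_end && b_start < a_end;
}
```

A 64 MiB mapping at 0xa0000000 ends at 0xa4000000 and stays clear of that reserved range, while a 512 MiB mapping at the same base ends at 0xc0000000 and would cover it.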


Re: Linux 2.6.39-rc3

2011-04-13 Thread H. Peter Anvin

On 04/13/2011 04:39 PM, Linus Torvalds wrote:
> 
>  - Choice #2: understand exactly _what_ goes wrong, and fix it
> analytically (ie by _understanding_ the problem, and being able to
> solve it exactly, and in a way you can argue about without having to
> resort to "magic happens").
> 
> Now, the whole analytic approach (aka "computer sciency" approach),
> where you can actually think about the problem without having any
> pesky "reality" impact the solution is obviously the one we tend to
> prefer. Sadly, it's seldom the one we can use in reality when it comes
> to things like resource allocation, since we end up starting off with
> often buggy approximations of what the actual hardware is all about
> (ie broken firmware tables).
> 
> So I'd love to know exactly why one random number works, and why
> another one doesn't. But as long as we do _not_ know the "Why" of it,
> we will have to revert.
> 

Yes.  However, even if we *do* revert (and the time is running short on
not reverting) I would like to understand this particular one, simply
because I think it may very well be a problem that is manifesting itself
in other ways on other systems.

The other thing that this has uncovered is that we already have a bunch
of complete b*llsh*t magic numbers in this path, some of which are
trivially shown to be wrong or at least completely arbitrary, so there
are more issues here :(

-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.

Re: Linux 2.6.39-rc3

2011-04-13 Thread Dave Airlie

On Wed, 2011-04-13 at 18:58 -0700, H. Peter Anvin wrote:
> On 04/13/2011 12:14 PM, Yinghai Lu wrote:
> > 
> > so those two patches uncover some problems.
> > 
> > [0.00] Checking aperture...
> > [0.00] No AGP bridge found
> > [0.00] Node 0: aperture @ a000 size 32 MB
> > [0.00] Aperture pointing to e820 RAM. Ignoring.
> > [0.00] Your BIOS doesn't leave a aperture memory hole
> > [0.00] Please enable the IOMMU option in the BIOS setup
> > [0.00] This costs you 64 MB of RAM
> > [0.00] memblock_x86_reserve_range: [0xa000-0xa3ff]  
> >  aperture64
> > [0.00] Mapping aperture over 65536 KB of RAM @ a000
> > 
> > so kernel try to reallocate apperture. because BIOS allocated is pointed to 
> > RAM or size is too small.
> > 
> > but your radeon does use [0xa000, 0xbfff)
> > 
> > [4.281993] radeon :01:05.0: VRAM: 320M 0xC000 - 
> > 0xD3FF (320M used)
> > [4.290672] radeon :01:05.0: GTT: 512M 0xA000 - 
> > 0xBFFF
> > [4.298550] [drm] Detected VRAM RAM=320M, BAR=256M
> > [4.309857] [drm] RAM width 32bits DDR
> > [4.313748] [TTM] Zone  kernel: Available graphics memory: 1896524 kiB.
> > [4.320379] [TTM] Initializing pool allocator.
> > [4.324948] [drm] radeon: 320M of VRAM memory ready
> > [4.329832] [drm] radeon: 512M of GTT memory ready.
> > 
> > and the one seems working:
> > 
> > [0.00] Checking aperture...
> > [0.00] No AGP bridge found
> > [0.00] Node 0: aperture @ a000 size 32 MB
> > [0.00] Aperture pointing to e820 RAM. Ignoring.
> > [0.00] Your BIOS doesn't leave a aperture memory hole
> > [0.00] Please enable the IOMMU option in the BIOS setup
> > [0.00] This costs you 64 MB of RAM
> > [0.00] memblock_x86_reserve_range: [0x8000-0x83ff]  
> >  aperture64
> > [0.00] Mapping aperture over 65536 KB of RAM @ 8000
> > [0.00] memblock_x86_reserve_range: [0xacb6bdc0-0xacb6bddf]  
> > BOOTMEM
> > 
> > will use different position...
> > 
> > [4.250159] radeon :01:05.0: VRAM: 320M 0xC000 - 
> > 0xD3FF (320M used)
> > [4.258830] radeon :01:05.0: GTT: 512M 0xA000 - 
> > 0xBFFF
> > [4.266742] [drm] Detected VRAM RAM=320M, BAR=256M
> > [4.271549] [drm] RAM width 32bits DDR
> > [4.275435] [TTM] Zone  kernel: Available graphics memory: 1896526 kiB.
> > [4.282066] [TTM] Initializing pool allocator.
> > [4.282085] usb 7-2: new full speed USB device number 2 using ohci_hcd
> > [4.293076] [drm] radeon: 320M of VRAM memory ready
> > [4.298277] [drm] radeon: 512M of GTT memory ready.
> > [4.303218] [drm] Supports vblank timestamp caching Rev 1 (10.10.2010).
> > [4.309854] [drm] Driver supports precise vblank timestamp query.
> > [4.315970] [drm] radeon: irq initialized.
> > [4.320094] [drm] GART: num cpu pages 131072, num gpu pages 131072
> > 
> > So question is why radeon is using the address [0xa000 - 0xc00], 
> > and in E820 it is RAM 
> > 
> > [0.00]  BIOS-e820: 0010 - acb8d000 (usable)
> > [0.00]  BIOS-e820: acb8d000 - acb8f000 (reserved)
> > [0.00]  BIOS-e820: acb8f000 - afce9000 (usable)
> > [0.00]  BIOS-e820: afce9000 - afd21000 (reserved)
> > [0.00]  BIOS-e820: afd21000 - afd4f000 (usable)
> > [0.00]  BIOS-e820: afd4f000 - afdcf000 (reserved)
> > [0.00]  BIOS-e820: afdcf000 - afecf000 (ACPI NVS)
> > [0.00]  BIOS-e820: afecf000 - afeff000 (ACPI data)
> > [0.00]  BIOS-e820: afeff000 - aff0 (usable)
> > 
> > so looks bios program wrong address to the radon card?
> > 
> 
> Okay, staring at this, it definitely seems toxic to overlay the GART
> over memory areas reserved by the BIOS.  If I were to guess, I would say
> that the problem here seems to be that the kernel thinks it is
> overlaying 64 MiB of memory, but the actual GART is in fact 512 MiB in
> size -- 131072 CPU pages -- which now overlaps the BIOS reserved areas.
> 
> Alex D., could you comment on the "num cpu pages" bit?

These are not CPU addresses. I think we've stated that already. Not the
droids.

the num cpu pages is how many CPU pages would be needed to fill the GPU
GTT, for those crazy cases where CPU pagesize != GPU pagesize.

Dave.
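
Dave's description of "num cpu pages" amounts to one division, sketched here in C. The helper name is hypothetical, and the 4 KiB page size in the note is an assumption that happens to reproduce the 131072 printed in the log for a 512M GTT.

```c
#include <assert.h>
#include <stdint.h>

/* Number of pages of a given size needed to cover the whole GTT. */
static inline uint64_t toy_num_pages(uint64_t gtt_bytes, uint64_t page_bytes)
{
    assert(page_bytes != 0);
    return gtt_bytes / page_bytes;
}
```

For a 512 MiB GTT and 4 KiB pages this gives 131072, the same for CPU and GPU when both page sizes match. With a larger CPU page size, say 64 KiB, the CPU count would drop to 8192 while the GPU count stayed at 131072, which is the mismatch case Dave mentions.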



Re: Linux 2.6.39-rc3

2011-04-13 Thread Linus Torvalds

On Wednesday, April 13, 2011, H. Peter Anvin  wrote:
>
> Yes.  However, even if we *do* revert (and the time is running short on
> not reverting) I would like to understand this particular one, simply
> because I think it may very well be a problem that is manifesting itself
> in other ways on other systems.
>
> The other thing that this has uncovered is that we already have a bunch
> of complete b*llsh*t magic numbers in this

Re: Linux 2.6.39-rc3

2011-04-13 Thread Linus Torvalds

On Wednesday, April 13, 2011, Linus Torvalds
 wrote:
> On Wednesday, April 13, 2011, H. Peter Anvin  wrote:
>>
>> Yes.  However, even if we *do* revert (and the time is running short on
>> not reverting) I would like to understand this particular one, simply
>> because I think it may very well be a problem that is manifesting itself
>> in other ways on other systems.

 sorry, fingerfart. Anyway, I agree 100%.

 we definitely want to also understand the reason for things not
working, even if we do revert..

Linus
>> of complete b*llsh*t magic numbers in this
>

Re: Linux 2.6.39-rc3

2011-04-13 Thread Tejun Heo

Hello,

On Wed, Apr 13, 2011 at 07:33:40PM -0700, Linus Torvalds wrote:
> On Wednesday, April 13, 2011, Linus Torvalds
>  wrote:
> > On Wednesday, April 13, 2011, H. Peter Anvin  wrote:
> >>
> >> Yes.  However, even if we *do* revert (and the time is running short on
> >> not reverting) I would like to understand this particular one, simply
> >> because I think it may very well be a problem that is manifesting itself
> >> in other ways on other systems.
> 
>  sorry, fingerfart. Anyway, I agree 100%.
> 
>  we definitely want to also understand the reason for things not
> working, even if we do revert..

There were (and still are) places where memblock callers implemented
ad-hoc top-down allocation by stepping down start limit until
allocation succeeds.  Several of them have been removed since top-down
became the default behavior, so simply reverting the commit is likely
to cause subtle issues.  Maybe the best approach is introducing
@topdown parameter and use it selectively for pure memory allocations.

Thanks.

-- 
tejun
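
The difference between the ad-hoc approach and an explicit top-down flag can
be sketched with a toy model. This is not memblock code: memory is one bool
per page, and every name here (find_area_bottom_up, find_area_stepping_down,
find_area and its topdown argument) is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model: used[i] is true when page i is already reserved. */
static bool range_free(const bool *used, long base, long size)
{
	for (long i = 0; i < size; i++)
		if (used[base + i])
			return false;
	return true;
}

/* Bottom-up search: lowest free base in [start, end) fitting size pages. */
static long find_area_bottom_up(const bool *used, long start, long end,
				long size)
{
	assert(size > 0 && start >= 0);
	for (long base = start; base + size <= end; base++)
		if (range_free(used, base, size))
			return base;
	return -1;
}

/* Ad-hoc top-down: lower the start limit by step until bottom-up succeeds. */
static long find_area_stepping_down(const bool *used, long low, long end,
				    long size, long step)
{
	assert(step > 0);
	for (long start = end - size; start >= low; start -= step) {
		long base = find_area_bottom_up(used, start, end, size);
		if (base >= 0)
			return base;
	}
	return -1;
}

/* Explicit direction flag: the caller states which end it wants. */
static long find_area(const bool *used, long start, long end, long size,
		      bool topdown)
{
	if (!topdown)
		return find_area_bottom_up(used, start, end, size);
	assert(size > 0 && start >= 0);
	for (long base = end - size; base >= start; base--)
		if (range_free(used, base, size))
			return base;
	return -1;
}
```

In this model the stepping-down variant only matches a true top-down search
when the step is a single page; with a coarser step it can settle on a lower
address than necessary, which is the kind of subtle difference a selective
flag would avoid.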

[Bug 28627] 2.6.31.6 is the last kernel where KMS works well on an RV515 card for regular PCI

2011-04-13 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=28627

--- Comment #21 from Connor Behan  2011-04-13 21:40:10 PDT ---
This bug largely goes away if I use kernels 2.6.37 and 2.6.38 with the Gallium
Radeon/DRI driver. In fact the glxgears framerates I get that way are slightly
better. Some things to note are that the framerates become awful again if I
turn "EXAPixmaps" "off" and that I still have trouble logging out of X. This is
surely a topic for another bug.

Thanks for all the work you've been doing!

-- 
Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are the assignee for the bug.

[Bug 35312] r600g: Automatic mipmap generation doesn't work properly

2011-04-13 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=35312

--- Comment #1 from Francis Whittle  2011-04-13 22:36:09 PDT ---
Created an attachment (id=45598)
 View: https://bugs.freedesktop.org/attachment.cgi?id=45598
 Review: https://bugs.freedesktop.org/review?bug=35312&attachment=45598

short patch to test problem

Can you try this patch to mesa and say if it fixes the issue?


Re: Linux 2.6.39-rc3

2011-04-13 Thread H. Peter Anvin

On 04/13/2011 07:07 PM, Dave Airlie wrote:
>>
>> Okay, staring at this, it definitely seems toxic to overlay the GART
>> over memory areas reserved by the BIOS.  If I were to guess, I would say
>> that the problem here seems to be that the kernel thinks it is
>> overlaying 64 MiB of memory, but the actual GART is in fact 512 MiB in
>> size -- 131072 CPU pages -- which now overlaps the BIOS reserved areas.
>>
>> Alex D., could you comment on the "num cpu pages" bit?
> 
> These are not CPU addresses. I think we've stated that already. Not the
> droids.
> 
> the num cpu pages is how many CPU pages would be needed to fill the GPU
> GTT, for those crazy cases where CPU pagesize != GPU pagesize.
> 

OK, well, something is still weird.

-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.


[PATCH] drm/i915: restore only the mode of this driver on lastclose (v2)

2011-04-13 Thread Dave Airlie

From: Dave Airlie 

i915 calls the panic handler function on last close to reset the modes,
however this is a really bad idea for multi-gpu machines, esp shareable
gpus machines. So add a new entry point for the driver to just restore
its own fbcon mode.

v2: move code into fb helper, fix panic code to block mode change on
powered off GPUs.

Signed-off-by: Dave Airlie 
---
 drivers/gpu/drm/drm_fb_helper.c  |   27 ---
 drivers/gpu/drm/i915/i915_dma.c  |2 +-
 drivers/gpu/drm/i915/intel_drv.h |1 +
 drivers/gpu/drm/i915/intel_fb.c  |   10 ++
 include/drm/drm_fb_helper.h  |1 +
 5 files changed, 33 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c
index 9507204..11d7a72 100644
--- a/drivers/gpu/drm/drm_fb_helper.c
+++ b/drivers/gpu/drm/drm_fb_helper.c
@@ -342,9 +342,22 @@ int drm_fb_helper_debug_leave(struct fb_info *info)
 }
 EXPORT_SYMBOL(drm_fb_helper_debug_leave);

+bool drm_fb_helper_restore_fbdev_mode(struct drm_fb_helper *fb_helper)
+{
+	bool error = false;
+	int i, ret;
+	for (i = 0; i < fb_helper->crtc_count; i++) {
+		struct drm_mode_set *mode_set = &fb_helper->crtc_info[i].mode_set;
+		ret = drm_crtc_helper_set_config(mode_set);
+		if (ret)
+			error = true;
+	}
+	return error;
+}
+EXPORT_SYMBOL(drm_fb_helper_restore_fbdev_mode);
+
 bool drm_fb_helper_force_kernel_mode(void)
 {
-	int i = 0;
 	bool ret, error = false;
 	struct drm_fb_helper *helper;

@@ -352,12 +365,12 @@ bool drm_fb_helper_force_kernel_mode(void)
 		return false;

 	list_for_each_entry(helper, &kernel_fb_helper_list, kernel_fb_list) {
-		for (i = 0; i < helper->crtc_count; i++) {
-			struct drm_mode_set *mode_set = &helper->crtc_info[i].mode_set;
-			ret = drm_crtc_helper_set_config(mode_set);
-			if (ret)
-				error = true;
-		}
+		if (helper->dev->switch_power_state == DRM_SWITCH_POWER_OFF)
+			continue;
+
+		ret = drm_fb_helper_restore_fbdev_mode(helper);
+		if (ret)
+			error = true;
 	}
 	return error;
 }
diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index 7273037..12876f2 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -2207,7 +2207,7 @@ void i915_driver_lastclose(struct drm_device * dev)
 	drm_i915_private_t *dev_priv = dev->dev_private;

 	if (!dev_priv || drm_core_check_feature(dev, DRIVER_MODESET)) {
-		drm_fb_helper_restore();
+		intel_fb_restore_mode(dev);
 		vga_switcheroo_process_delayed_switch();
 		return;
 	}
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index f5b0d83..1d20712 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -338,4 +338,5 @@ extern int intel_overlay_attrs(struct drm_device *dev, void *data,
 				 struct drm_file *file_priv);

 extern void intel_fb_output_poll_changed(struct drm_device *dev);
+extern void intel_fb_restore_mode(struct drm_device *dev);
 #endif /* __INTEL_DRV_H__ */
diff --git a/drivers/gpu/drm/i915/intel_fb.c b/drivers/gpu/drm/i915/intel_fb.c
index 5127827..ec49bae 100644
--- a/drivers/gpu/drm/i915/intel_fb.c
+++ b/drivers/gpu/drm/i915/intel_fb.c
@@ -264,3 +264,13 @@ void intel_fb_output_poll_changed(struct drm_device *dev)
 	drm_i915_private_t *dev_priv = dev->dev_private;
 	drm_fb_helper_hotplug_event(&dev_priv->fbdev->helper);
 }
+
+void intel_fb_restore_mode(struct drm_device *dev)
+{
+	int ret;
+	drm_i915_private_t *dev_priv = dev->dev_private;
+
+	ret = drm_fb_helper_restore_fbdev_mode(&dev_priv->fbdev->helper);
+	if (ret)
+		DRM_DEBUG("failed to restore crtc mode\n");
+}
diff --git a/include/drm/drm_fb_helper.h b/include/drm/drm_fb_helper.h
index f22e7fe..ade09d7 100644
--- a/include/drm/drm_fb_helper.h
+++ b/include/drm/drm_fb_helper.h
@@ -118,6 +118,7 @@ int drm_fb_helper_setcolreg(unsigned regno,
 			    unsigned transp,
 			    struct fb_info *info);

+bool drm_fb_helper_restore_fbdev_mode(struct drm_fb_helper *fb_helper);
 void drm_fb_helper_restore(void);
 void drm_fb_helper_fill_var(struct fb_info *info, struct drm_fb_helper *fb_helper,
 			    uint32_t fb_width, uint32_t fb_height);
-- 
1.7.1

[git pull] drm fixes

2011-04-13 Thread Dave Airlie


Hi Linus,

This should have gone out a few days ago, but I was trapped watching
Disney shows with my daughter at home and I wanted to check it on a few
more machines.

It's got two reverts: one for a change I pushed out by accident to -fixes,
the other for a Xen/TTM change that looks to be causing non-Xen
problems, so punting on it for now. The rest is mostly nouveau + radeon
fixes; the radeon ones fix a few regressions and stability problems on
newer cards.

I suspect I'll have a few more intel fixes and v2 of the i915 patch I
reverted out of this pull; it fixes a problem on the dual-gpu laptops
reported a long while ago.

The following changes since commit 94c8a984ae2adbd9a9626fb42e0f2faf3e36e86f:

  Merge branch 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6 (2011-04-08 11:47:35 -0700)

are available in the git repository at:

  ssh://master.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6.git drm-fixes

Alex Deucher (7):
  drm/radeon/kms: pll tweaks for rv6xx
  drm/radeon/kms: make radeon i2c put/get bytes less noisy
  drm/radeon/kms: clean up gart dummy page handling
  drm/radeon/kms: fix suspend on rv530 asics
  drm/radeon/kms: fix pcie_p callbacks on btc and cayman
  drm/radeon/kms: add voltage type to atom set voltage function
  drm/radeon/kms: properly program vddci on evergreen+

Ben Skeggs (5):
  drm/nouveau: implement init table opcode 0x5c
  drm/nouveau: quirk for XFX GT-240X-YA
  drm/nv50: use "nv86" tlb flush method on everything except 0x50/0xac
  drm/nv50-nvc0: remove some code that doesn't belong here
  drm/nvc0: improve vm flush function

Dave Airlie (4):
  i915: restore only the mode of this driver on lastclose
  Merge remote branch 'nouveau/drm-nouveau-fixes' of /ssd/git/drm-nouveau-next into drm-fixes
  Revert "ttm: Utilize the DMA API for pages that have TTM_PAGE_FLAG_DMA32 set."
  Revert "i915: restore only the mode of this driver on lastclose"

David Dillow (1):
  drm/nv50-nvc0: work around an evo channel hang that some people see

Emil Velikov (1):
  nv30: Fix parsing of perf table

Konstantin Khlebnikov (1):
  i915: select VIDEO_OUTPUT_CONTROL for ACPI_VIDEO

Marcin Slusarz (1):
  drm/nouveau: fix oops on unload with disabled LVDS panel

Michel Dänzer (2):
  radeon: Fix KMS CP writeback on big endian machines.
  drm/radeon: Fix KMS legacy backlight support if CONFIG_BACKLIGHT_CLASS_DEVICE=m.

Roy Spliet (1):
  drm/nouveau: correct memtiming table parsing for nv4x

 drivers/gpu/drm/Kconfig |1 +
 drivers/gpu/drm/nouveau/nouveau_bios.c  |   53 +++-
 drivers/gpu/drm/nouveau/nouveau_drv.h   |2 +-
 drivers/gpu/drm/nouveau/nouveau_mem.c   |   76 +++
 drivers/gpu/drm/nouveau/nouveau_perf.c  |2 +-
 drivers/gpu/drm/nouveau/nouveau_state.c |   12 +---
 drivers/gpu/drm/nouveau/nv04_dfp.c  |   13 ++--
 drivers/gpu/drm/nouveau/nv50_crtc.c |3 -
 drivers/gpu/drm/nouveau/nv50_evo.c  |1 +
 drivers/gpu/drm/nouveau/nv50_graph.c|2 +-
 drivers/gpu/drm/nouveau/nvc0_vm.c   |   24 +---
 drivers/gpu/drm/radeon/atom.c   |6 ++-
 drivers/gpu/drm/radeon/atombios_crtc.c  |6 ++
 drivers/gpu/drm/radeon/evergreen.c  |   17 +++---
 drivers/gpu/drm/radeon/r600.c   |6 +--
 drivers/gpu/drm/radeon/radeon.h |   12 +++-
 drivers/gpu/drm/radeon/radeon_asic.c|2 +-
 drivers/gpu/drm/radeon/radeon_atombios.c|   30 ++---
 drivers/gpu/drm/radeon/radeon_fence.c   |2 +-
 drivers/gpu/drm/radeon/radeon_gart.c|2 +
 drivers/gpu/drm/radeon/radeon_i2c.c |4 +-
 drivers/gpu/drm/radeon/radeon_legacy_encoders.c |2 +-
 drivers/gpu/drm/radeon/radeon_pm.c  |   11 +++-
 drivers/gpu/drm/radeon/radeon_ring.c|2 +-
 drivers/gpu/drm/radeon/rs600.c  |2 +-
 drivers/gpu/drm/radeon/rv770.c  |6 +--
 drivers/gpu/drm/ttm/ttm_page_alloc.c|   26 +---
 drivers/gpu/stub/Kconfig|1 +
 28 files changed, 201 insertions(+), 125 deletions(-)

[Bug 34534] resolution 3840x1024 stopped to work on HD5850 after switch to 2.6.37 kernel

2011-04-13 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=34534

--- Comment #15 from Peter Hercek  2011-04-13 00:37:56 PDT ---
Created an attachment (id=45562)
 --> (https://bugs.freedesktop.org/attachment.cgi?id=45562)
xrandr --verbose output on 2.6.38.2-vanilla (with 3840x1024 fixed using
radeonreg regset 0x770c 0x00020004)


Revert 737a3bb9416ce2a7c7a4170852473a4fcc9c67e8 ?

2011-04-13 Thread Gabriel Paubert

On Tue, Apr 12, 2011 at 07:29:22PM +0200, Michel Dänzer wrote:
> On Die, 2011-04-12 at 14:00 +0200, Gabriel Paubert wrote:
> > On Tue, Apr 12, 2011 at 01:46:10PM +0200, Michel Dänzer wrote:
> > > > 
> > > > With no_wb=1 the driver goes a bit further but the X server ends
> > > > up in an infinite ioctl loop and the logs are: 
> > > 
> > > Which ioctl does it loop on? Please provide the Xorg.0.log file as well.
> > 
> > From memory, the code was 0x64, which is DRM_RADEON_GEM_WAIT_IDLE.
> 
> Note that it's normal for this ioctl to be called every time before the
> GPU accessible pixmap memory is accessed by the CPU. Unless the ioctl
> always returns an error, this may not indicate a problem on its own. 

It seems to be an infinite loop, always returning EINTR because
of regular SIGALRM delivery.

> 
> 
> > The Xorg.0.log from the previous boot is attached.
> 
> I don't see any obvious problems in it. Can you describe the symptoms of
> the problem you're having with X a bit more?

Well, X is dead, or rather in an infinite ioctl loop as described  above.
IIRC, the display enters a power-down mode and there is nothing to see.

> 
> One thing I notice is that the X server/driver are rather oldish. Maybe
> you can try newer versions from testing, sid or even experimental to see
> if that makes any difference.

I lack time to do it until early May (being away for 2 weeks starting on
Friday and busy with urgent things). I'm indeed on Debian stable (Squeeze),
which is rather recent, and the machine is about 2 1/2 years old.

Gabriel
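
The loop described above can be reproduced in miniature. This is only a
sketch of the common "retry while interrupted" pattern around a system call,
not the X server's or libdrm's actual code; call_fn, retry_while_interrupted
and the test double interrupted_n_times are hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for an ioctl: 0 on success, -1 with errno set. */
typedef int (*call_fn)(void *ctx);

/*
 * Retry a call for as long as it is interrupted by a signal.
 * *attempts counts how many times fn was invoked.
 */
static int retry_while_interrupted(call_fn fn, void *ctx,
				   unsigned long *attempts)
{
	int ret;

	assert(fn != NULL && attempts != NULL);
	*attempts = 0;
	do {
		ret = fn(ctx);
		(*attempts)++;
	} while (ret == -1 && errno == EINTR);
	return ret;
}

/* Test double: fails with EINTR until *remaining reaches zero. */
static int interrupted_n_times(void *ctx)
{
	int *remaining = ctx;

	if (*remaining > 0) {
		(*remaining)--;
		errno = EINTR;
		return -1;
	}
	return 0;
}
```

If the underlying wait never completes between two signals, as with a hung
GPU and a periodic SIGALRM, the do/while condition stays true and the caller
spins indefinitely.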

Revert 737a3bb9416ce2a7c7a4170852473a4fcc9c67e8 ?

2011-04-13 Thread Gabriel Paubert

On Tue, Apr 12, 2011 at 01:46:10PM +0200, Michel Dänzer wrote:
> BTW, if your kernel contains commit
> 69a07f0b117a40fcc1a479358d8e1f41793617f2, can you try if reverting that
> helps?

My kernel is pristine 2.6.38 and does not include this commit
(was introduced before 2.6.39-rc1 according to gitk).

Gabriel

[Bug 34534] resolution 3840x1024 stopped to work on HD5850 after switch to 2.6.37 kernel

2011-04-13 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=34534

--- Comment #16 from Peter Hercek  2011-04-13 01:37:03 PDT ---
(In reply to comment #14)
> Does this patch help?

No, the image stays corrupted, I still need to do this to fix it:
# radeonreg regset 0x770c 0x00020004
OLD: 0x770c (770c)0x00010005 (65541)
NEW: 0x770c (770c)0x00010004 (65540)
#

I applied and tested the patch with 2.6.38.2-vanilla.


Revert 737a3bb9416ce2a7c7a4170852473a4fcc9c67e8 ?

2011-04-13 Thread Benjamin Herrenschmidt

On Wed, 2011-04-13 at 09:59 +0200, Gabriel Paubert wrote:
> 
> Well, X is dead, or rather in an infinite ioctl loop as described
> above.
> IIRC, the display enters a power-down mode and there is nothing to
> see.

So basically the card crashed. There's about an infinite amount of
reasons why radeons do so, sometimes it has to do with them not liking
what you ate that day...

The only thing I can see that could be of use would be a bisect

Cheers,
Ben.

Revert 737a3bb9416ce2a7c7a4170852473a4fcc9c67e8 ?

2011-04-13 Thread Gabriel Paubert

On Wed, Apr 13, 2011 at 06:16:13PM +1000, Benjamin Herrenschmidt wrote:
> On Wed, 2011-04-13 at 09:59 +0200, Gabriel Paubert wrote:
> > 
> > Well, X is dead, or rather in an infinite ioctl loop as described
> > above.
> > IIRC, the display enters a power-down mode and there is nothing to
> > see.
> 
> So basically the card crashed. There's about an infinite amount of
> reasons why radeons do so, sometimes it has to do with them not liking
> what you ate that day...
> 
> The only thing I can see that could be of use would be a bisect

Bisecting for something which I have never got to work (radeon with
KMS) on this machine is something I don't know how to do...

Note that radeon without KMS also always ends up crashing, but it
may take hours. The only case where the machine works reliably is 
when glxinfo claims that it is using software rendering.

Regards,
Gabriel

small git lesson [Was: Re: Revert 737a3bb9416ce2a7c7a4170852473a4fcc9c67e8 ?]

2011-04-13 Thread Gabriel Paubert

On Wed, Apr 13, 2011 at 10:59:14AM +0200, Andreas Schwab wrote:
> Uwe Kleine-König  writes:
> 
> > $ git name-rev --refs=refs/tags/v2.6\* 
> > 69a07f0b117a40fcc1a479358d8e1f41793617f2
> > 69a07f0b117a40fcc1a479358d8e1f41793617f2 tags/v2.6.39-rc2~3^2~43^2~4
> >
> > so it was introduced just before -rc2.
> 
> $ git tag --contains 69a07f0b117a40fcc1a479358d8e1f41793617f2
> v2.6.39-rc1
> v2.6.39-rc2
> 

So who is right? I think it was before rc1. 

Anyway I'm aware that there are other git commands, although for the option
details I often have to have a look at the man page.

However, in this case the main reason to fire up gitk was to have a quick look
at the patch and its context, and I simply reported the "Precedes" line
in the display, which is 2.6.39-rc1. It also follows v2.6.37-rc2, which means
that it has been outside the main tree for quite a long time.

Gabriel

[Bug 35502] Regression: black screen with Radeon KMS in 2.6.38 (2.6.37.4 worked fine)

2011-04-13 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=35502

Michel Dänzer  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bryce at canonical.com

--- Comment #9 from Michel Dänzer  2011-04-13 04:45:42 PDT ---
*** Bug 36007 has been marked as a duplicate of this bug. ***


[PATCH] Big Endian support for RV730 (Mesa r600)

2011-04-13 Thread Benjamin Herrenschmidt

On Tue, 2011-04-12 at 10:01 +0200, Cédric Cano wrote:
> Hi
> 
> Here is a patch that adds big endian support for rv730 in the r600 
> classic mesa driver. The BE modifications are almost the same as the DRM 
> / DDX driver modifications 
> (http://lists.freedesktop.org/archives/dri-devel/2011-February/008151.html).
> 
> I used the mesa-demos to test the driver status on a big endian platform. 
> Nearly all demos render the same as on Intel architecture. 
> Nevertheless, there are still some issues in glReadPixels (r600_blit) 
> with some formats. I can't figure out exactly what and when data must be 
> swapped (set_tex_resoures, set_render_target...). Review of the patch 
> would be greatly appreciated.
> 
> It seems that r600g will be the default for Mesa 7.11 so I'll try to 
> enable big endian support for Gallium now.

Cool stuff !

I'll try to test that one of these days on various ppc's

Cheers,
Ben.

[PATCH] Big Endian support for RV730 (Mesa r600)

2011-04-13 Thread Benjamin Herrenschmidt

On Wed, 2011-04-13 at 22:05 +1000, Benjamin Herrenschmidt wrote:
> On Tue, 2011-04-12 at 10:01 +0200, Cédric Cano wrote:
> > Hi
> > 
> > Here is a patch that adds big endian support for rv730 in the r600 
> > classic mesa driver. The BE modifications are almost the same as the DRM 
> > / DDX driver modifications 
> > (http://lists.freedesktop.org/archives/dri-devel/2011-February/008151.html).
> > 
> > I used the mesa-demos to test the driver status on a big endian platform. 
> > Nearly all demos render the same as on Intel architecture. 
> > Nevertheless, there are still some issues in glReadPixels (r600_blit) 
> > with some formats. I can't figure out exactly what and when data must be 
> > swapped (set_tex_resoures, set_render_target...). Review of the patch 
> > would be greatly appreciated.
> > 
> > It seems that r600g will be the default for Mesa 7.11 so I'll try to 
> > enable big endian support for Gallium now.
> 
> Cool stuff !
> 
> I'll try to test that one of these days on various ppc's

BTW. I see you used some FSL embedded board. Do you have your PCIe MMIO
space above 32-bit ? Last I looked, there was a bunch of fixing needing
to be done, among others in the TTM, to make that work.

I had some preliminary patches but they bitrotted... mostly the issue is to
make sure that a phys_addr_t is used instead of an unsigned long
whenever it tries to store the physical address of an object.

Ben.
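
A small sketch of why the type matters, under stated assumptions: uint32_t
stands in for a 32-bit kernel's unsigned long, uint64_t for a 64-bit
phys_addr_t, and the address 0xC10000000 is a made-up example of MMIO space
above 4 GiB. The helper names (store_in_ulong, fits_in_ulong, bar_end) are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Model of a 32-bit kernel's "unsigned long": only 32 bits wide. */
typedef uint32_t ulong32_t;
/* Model of phys_addr_t on a platform with physical addresses above 4 GiB. */
typedef uint64_t phys64_t;

/* Storing a physical address in the narrow type drops the high bits. */
static ulong32_t store_in_ulong(phys64_t phys)
{
	return (ulong32_t)phys;
}

/* True if the address survives a round trip through the narrow type. */
static int fits_in_ulong(phys64_t phys)
{
	return (phys64_t)store_in_ulong(phys) == phys;
}

/* Last byte of a region, computed in the wide type so it cannot wrap. */
static phys64_t bar_end(phys64_t start, phys64_t size)
{
	assert(size != 0);
	return start + size - 1;
}
```

An address such as 0xC10000000 comes back as 0x10000000 after the round trip,
so any code path that keeps it in the narrow type would end up pointing at the
wrong physical location without any error being raised.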

Revert 737a3bb9416ce2a7c7a4170852473a4fcc9c67e8 ?

2011-04-13 Thread Michel Dänzer

On Mit, 2011-04-13 at 09:59 +0200, Gabriel Paubert wrote: 
> On Tue, Apr 12, 2011 at 07:29:22PM +0200, Michel Dänzer wrote:
> > On Die, 2011-04-12 at 14:00 +0200, Gabriel Paubert wrote:
> > > On Tue, Apr 12, 2011 at 01:46:10PM +0200, Michel Dänzer wrote:
> > > > > 
> > > > > With no_wb=1 the driver goes a bit further but the X server ends
> > > > > up in an infinite ioctl loop and the logs are: 
> > > > 
> > > > Which ioctl does it loop on? Please provide the Xorg.0.log file as well.
> > > 
> > > From memory, the code was 0x64, which is DRM_RADEON_GEM_WAIT_IDLE.
> > 
> > Note that it's normal for this ioctl to be called every time before the
> > GPU accessible pixmap memory is accessed by the CPU. Unless the ioctl
> > always returns an error, this may not indicate a problem on its own. 
> 
> It seems to be an infinite loop, always returning EINTR because
> of regular SIGALRM delivery.

That does sound like the GPU locks up. Do you get any messages in dmesg
about lockups and attempts to reset the GPU at any time?


-- 
Earthling Michel Dänzer   |http://www.vmware.com
Libre software enthusiast |  Debian, X and DRI developer

Revert 737a3bb9416ce2a7c7a4170852473a4fcc9c67e8 ?

2011-04-13 Thread Gabriel Paubert

On Wed, Apr 13, 2011 at 02:12:16PM +0200, Michel Dänzer wrote:
> On Mit, 2011-04-13 at 09:59 +0200, Gabriel Paubert wrote: 
> > On Tue, Apr 12, 2011 at 07:29:22PM +0200, Michel Dänzer wrote:
> > > On Die, 2011-04-12 at 14:00 +0200, Gabriel Paubert wrote:
> > > > On Tue, Apr 12, 2011 at 01:46:10PM +0200, Michel Dänzer wrote:
> > > > > > 
> > > > > > With no_wb=1 the driver goes a bit further but the X server ends
> > > > > > up in an infinite ioctl loop and the logs are: 
> > > > > 
> > > > > Which ioctl does it loop on? Please provide the Xorg.0.log file as 
> > > > > well.
> > > > 
> > > > From memory, the code was 0x64, which is DRM_RADEON_GEM_WAIT_IDLE.
> > > 
> > > Note that it's normal for this ioctl to be called every time before the
> > > GPU accessible pixmap memory is accessed by the CPU. Unless the ioctl
> > > always returns an error, this may not indicate a problem on its own. 
> > 
> > It seems to be an infinite loop, always returning EINTR because
> > of regular SIGALRM delivery.
> 
> That does sound like the GPU locks up. Do you get any messages in dmesg
> about lockups and attempts to reset the GPU at any time?

No.

Gabriel

Revert 737a3bb9416ce2a7c7a4170852473a4fcc9c67e8 ?

2011-04-13 Thread Michel Dänzer

On Mit, 2011-04-13 at 14:27 +0200, Gabriel Paubert wrote: 
> On Wed, Apr 13, 2011 at 02:12:16PM +0200, Michel Dänzer wrote:
> > On Mit, 2011-04-13 at 09:59 +0200, Gabriel Paubert wrote: 
> > > On Tue, Apr 12, 2011 at 07:29:22PM +0200, Michel Dänzer wrote:
> > > > On Die, 2011-04-12 at 14:00 +0200, Gabriel Paubert wrote:
> > > > > On Tue, Apr 12, 2011 at 01:46:10PM +0200, Michel Dänzer wrote:
> > > > > > > 
> > > > > > > With no_wb=1 the driver goes a bit further but the X server ends
> > > > > > > up in an infinite ioctl loop and the logs are: 
> > > > > > 
> > > > > > Which ioctl does it loop on? Please provide the Xorg.0.log file as 
> > > > > > well.
> > > > > 
> > > > > From memory, the code was 0x64, which is DRM_RADEON_GEM_WAIT_IDLE.
> > > > 
> > > > Note that it's normal for this ioctl to be called every time before the
> > > > GPU accessible pixmap memory is accessed by the CPU. Unless the ioctl
> > > > always returns an error, this may not indicate a problem on its own. 
> > > 
> > > It seems to be an infinite loop, always returning EINTR because
> > > of regular SIGALRM delivery.
> > 
> > That does sound like the GPU locks up. Do you get any messages in dmesg
> > about lockups and attempts to reset the GPU at any time?
> 
> No.

Hmm, I guess the constant SIGALRMs might prevent the lockup detection
from kicking in... Maybe you can try starting the X server with
-dumbSched to see if that gets things along any further, but in the end
there's probably no way around figuring out what causes the lockup and
fixing that anyway.


-- 
Earthling Michel Dänzer   |http://www.vmware.com
Libre software enthusiast |  Debian, X and DRI developer

[PATCH] drm/radeon/kms: fix suspend on rv530 asics

2011-04-13 Thread Jerome Glisse

On Tue, Apr 12, 2011 at 1:33 PM, Alex Deucher  wrote:
> Apparently only rv515 asics need the workaround
> added in f24d86f1a49505cdea56728b853a5d0a3f8e3d11
> (drm/radeon/kms: fix resume regression for some r5xx laptops).
>
> Fixes:
> https://bugs.freedesktop.org/show_bug.cgi?id=34709
>
> Signed-off-by: Alex Deucher 
> Cc: stable at kernel.org
> ---
>  drivers/gpu/drm/radeon/atom.c |    6 +-
>  1 files changed, 5 insertions(+), 1 deletions(-)
>
> diff --git a/drivers/gpu/drm/radeon/atom.c b/drivers/gpu/drm/radeon/atom.c
> index 258fa5e..d71d375 100644
> --- a/drivers/gpu/drm/radeon/atom.c
> +++ b/drivers/gpu/drm/radeon/atom.c
> @@ -32,6 +32,7 @@
>  #include "atom.h"
>  #include "atom-names.h"
>  #include "atom-bits.h"
> +#include "radeon.h"
>
>  #define ATOM_COND_ABOVE                0
>  #define ATOM_COND_ABOVEOREQUAL 1
> @@ -101,7 +102,9 @@ static void debug_print_spaces(int n)
>  static uint32_t atom_iio_execute(struct atom_context *ctx, int base,
>                                   uint32_t index, uint32_t data)
>  {
> +       struct radeon_device *rdev = ctx->card->dev->dev_private;
>         uint32_t temp = 0xCDCDCDCD;
> +
>         while (1)
>                 switch (CU8(base)) {
>                 case ATOM_IIO_NOP:
> @@ -112,7 +115,8 @@ static uint32_t atom_iio_execute(struct atom_context *ctx, int base,
>                         base += 3;
>                         break;
>                 case ATOM_IIO_WRITE:
> -                       (void)ctx->card->ioreg_read(ctx->card, CU16(base + 1));
> +                       if (rdev->family == CHIP_RV515)
> +                               (void)ctx->card->ioreg_read(ctx->card, CU16(base + 1));
>                         ctx->card->ioreg_write(ctx->card, CU16(base + 1), temp);
>                         base += 3;
>                         break;
> --
> 1.7.1.1
>


So this patch enables io write only for one family? This looks utterly strange.

Cheers,
Jerome

[PATCH] drm/radeon/kms: fix suspend on rv530 asics

2011-04-13 Thread Alex Deucher

On Wed, Apr 13, 2011 at 10:46 AM, Jerome Glisse  wrote:
> On Tue, Apr 12, 2011 at 1:33 PM, Alex Deucher  wrote:
>> Apparently only rv515 asics need the workaround
>> added in f24d86f1a49505cdea56728b853a5d0a3f8e3d11
>> (drm/radeon/kms: fix resume regression for some r5xx laptops).
>>
>> Fixes:
>> https://bugs.freedesktop.org/show_bug.cgi?id=34709
>>
>> Signed-off-by: Alex Deucher 
>> Cc: stable at kernel.org
>> ---
>>  drivers/gpu/drm/radeon/atom.c |    6 +-
>>  1 files changed, 5 insertions(+), 1 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/radeon/atom.c b/drivers/gpu/drm/radeon/atom.c
>> index 258fa5e..d71d375 100644
>> --- a/drivers/gpu/drm/radeon/atom.c
>> +++ b/drivers/gpu/drm/radeon/atom.c
>> @@ -32,6 +32,7 @@
>>  #include "atom.h"
>>  #include "atom-names.h"
>>  #include "atom-bits.h"
>> +#include "radeon.h"
>>
>>  #define ATOM_COND_ABOVE                0
>>  #define ATOM_COND_ABOVEOREQUAL 1
>> @@ -101,7 +102,9 @@ static void debug_print_spaces(int n)
>>  static uint32_t atom_iio_execute(struct atom_context *ctx, int base,
>>                                   uint32_t index, uint32_t data)
>>  {
>> +       struct radeon_device *rdev = ctx->card->dev->dev_private;
>>         uint32_t temp = 0xCDCDCDCD;
>> +
>>         while (1)
>>                 switch (CU8(base)) {
>>                 case ATOM_IIO_NOP:
>> @@ -112,7 +115,8 @@ static uint32_t atom_iio_execute(struct atom_context *ctx, int base,
>>                         base += 3;
>>                         break;
>>                 case ATOM_IIO_WRITE:
>> -                       (void)ctx->card->ioreg_read(ctx->card, CU16(base + 1));
>> +                       if (rdev->family == CHIP_RV515)
>> +                               (void)ctx->card->ioreg_read(ctx->card, CU16(base + 1));
>>                         ctx->card->ioreg_write(ctx->card, CU16(base + 1), temp);
>>                         base += 3;
>>                         break;
>> --
>> 1.7.1.1
>>
>
>
> So this patch enables io write only for one family? This looks utterly
> strange.

No, it just does a read before write for rv515.  I don't know why it
needs it, but it seems to.

Alex

>
> Cheers,
> Jerome
>

[Bug 25588] Lots of ARB_vertex_program/fragment_program parser errors in ETQW (if GLSL is unavailable)

2011-04-13 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=25588

Fabio Pedretti  changed:

           What    |Removed                           |Added
----------------------------------------------------------------------------
         Resolution|WORKSFORME                        |WONTFIX
          Component|Mesa core                         |Drivers/DRI/r300
         AssignedTo|mesa-dev at lists.freedesktop.org |dri-devel at lists.freedesktop.org

-- 
Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are the assignee for the bug.

[Bug 33222] New: [RADEON] Oops in worker thread for radeon_unpin_work_func

2011-04-13 Thread bugzilla-dae...@bugzilla.kernel.org

https://bugzilla.kernel.org/show_bug.cgi?id=33222

   Summary: [RADEON] Oops in worker thread for
radeon_unpin_work_func
   Product: Drivers
   Version: 2.5
Kernel Version: 2.6.38.2
  Platform: All
OS/Version: Linux
  Tree: Mainline
Status: NEW
  Severity: low
  Priority: P1
 Component: Video(DRI - non Intel)
AssignedTo: drivers_video-dri at kernel-bugs.osdl.org
ReportedBy: thomas at m3y3r.de
Regression: No


Created an attachment (id=54282)
 --> (https://bugzilla.kernel.org/attachment.cgi?id=54282)
Oops - Part 1

A few days ago I stumbled upon the attached oops. I only have images of it,
sorry for that. This is the first time I have seen this oops; I hit it just
once, on 2.6.38.

-- 
Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are watching the assignee of the bug.

___
Dri-devel mailing list
Dri-devel at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel

[Bug 33222] [RADEON] Oops in worker thread for radeon_unpin_work_func

2011-04-13 Thread bugzilla-dae...@bugzilla.kernel.org

https://bugzilla.kernel.org/show_bug.cgi?id=33222





--- Comment #1 from Thomas Meyer   2011-04-13 17:09:48 ---
Created an attachment (id=54292)
 --> (https://bugzilla.kernel.org/attachment.cgi?id=54292)
Oops - Part 2

small git lesson [Was: Re: Revert 737a3bb9416ce2a7c7a4170852473a4fcc9c67e8 ?]

2011-04-13 Thread Uwe Kleine-König

On Wed, Apr 13, 2011 at 10:02:04AM +0200, Gabriel Paubert wrote:
> On Tue, Apr 12, 2011 at 01:46:10PM +0200, Michel Dänzer wrote:
> > BTW, if your kernel contains commit
> > 69a07f0b117a40fcc1a479358d8e1f41793617f2, can you try if reverting that
> > helps?
> 
> My kernel is pristine 2.6.38 and does not include this commit
> (was introduced before 2.6.39-rc1 according to gitk).
gitk is not the best tool to find this out.

$ git name-rev --refs=refs/tags/v2.6\* 69a07f0b117a40fcc1a479358d8e1f41793617f2
69a07f0b117a40fcc1a479358d8e1f41793617f2 tags/v2.6.39-rc2~3^2~43^2~4

so it was introduced just before -rc2.

Best regards
Uwe

-- 
Pengutronix e.K.           | Uwe Kleine-König            |
Industrial Linux Solutions | http://www.pengutronix.de/  |

small git lesson [Was: Re: Revert 737a3bb9416ce2a7c7a4170852473a4fcc9c67e8 ?]

2011-04-13 Thread Andreas Schwab

Uwe Kleine-König writes:

> $ git name-rev --refs=refs/tags/v2.6\* 69a07f0b117a40fcc1a479358d8e1f41793617f2
> 69a07f0b117a40fcc1a479358d8e1f41793617f2 tags/v2.6.39-rc2~3^2~43^2~4
>
> so it was introduced just before -rc2.

$ git tag --contains 69a07f0b117a40fcc1a479358d8e1f41793617f2
v2.6.39-rc1
v2.6.39-rc2

Andreas.

-- 
Andreas Schwab, schwab at redhat.com
GPG Key fingerprint = D4E8 DBE3 3813 BB5D FA84  5EC7 45C6 250E 6F00 984E
"And now for something completely different."

small git lesson [Was: Re: Revert 737a3bb9416ce2a7c7a4170852473a4fcc9c67e8 ?]

2011-04-13 Thread Uwe Kleine-König

Hello Gabriel
On Wed, Apr 13, 2011 at 12:31:44PM +0200, Gabriel Paubert wrote:
> On Wed, Apr 13, 2011 at 10:59:14AM +0200, Andreas Schwab wrote:
> > Uwe Kleine-König writes:
> > 
> > > $ git name-rev --refs=refs/tags/v2.6\* 69a07f0b117a40fcc1a479358d8e1f41793617f2
> > > 69a07f0b117a40fcc1a479358d8e1f41793617f2 tags/v2.6.39-rc2~3^2~43^2~4
> > >
> > > so it was introduced just before -rc2.
> > 
> > $ git tag --contains 69a07f0b117a40fcc1a479358d8e1f41793617f2
> > v2.6.39-rc1
> > v2.6.39-rc2
> > 
> 
> So who is right? I think it was before rc1. 
Yep, correct. I interpreted the output of git name-rev to mean it's not
included in a tag earlier than v2.6.39-rc2, but actually that's wrong.
It's just that it's easier (for some definition of easy) to reach the
commit in question from v2.6.39-rc2 than from v2.6.39-rc1.

> However in this case the main reason to fire gitk was to have a quick look 
> at the patch and its context, and simply reported the "Precedes" line 
> in the display, which is 2.6.39-rc1. It also follows v2.6.37-rc2, which means
> that it has been quite a long time outside the main tree.
I think this conclusion isn't valid in general. (E.g. in git itself a
bug fix is often made on top of the commit that introduced the bug and then
merged into master. The bug fix might still be new.) But looking at the
AuthorDate of 69a07f0b117a seems to support your statement.

Best regards
Uwe

-- 
Pengutronix e.K.           | Uwe Kleine-König            |
Industrial Linux Solutions | http://www.pengutronix.de/  |

[Bug 33222] [RADEON] Oops in worker thread for radeon_unpin_work_func

2011-04-13 Thread bugzilla-dae...@bugzilla.kernel.org

https://bugzilla.kernel.org/show_bug.cgi?id=33222


Alex Deucher  changed:

   What|Removed |Added

 CC||alexdeucher at gmail.com




--- Comment #2 from Alex Deucher   2011-04-13 17:19:06 ---
This is a duplicate of bug 32402.

Linux 2.6.39-rc3

2011-04-13 Thread Joerg Roedel

On Wed, Apr 13, 2011 at 08:46:09AM +0200, Ingo Molnar wrote:
> Could you please send the before/after bootlog (in particular all memory init 
> messages included) and your .config?
> 
>  before:  f005fe12b90c: x86-64: Move out cleanup higmap [_brk_end, _end) out 
> of init_memory_mapping()
>   after:  d2137d5af425: Merge branch 'linus' into x86/bootmem
> 
> I've Cc:-ed more people who might have an idea about it.

Okay, I have done some more bisecting and debugging today.

First of all, I bisected between v2.6.37-rc2..f005fe12b90c, which were
only a couple of patches, and merged v2.6.38-rc4 in at every step. No
failure was found.
Then I tried this again, but this time I merged v2.6.38-rc5 at every
step, and the failure showed up. The bad commit in this branch turned out to be

1a4a678b12c84db9ae5dce424e0e97f0559bb57c

which is related to memblock.

Then I tried to find out which change between 2.6.38-rc4 and 2.6.38-rc5
is needed to trigger the failure, so I used f005fe12b90c as a base,
bisected between v2.6.38-rc4..v2.6.38-rc5 and merged every bisect step
into the base and tested. Here the bad commit turned out to be

e6d2e2b2b1e1455df16d68a78f4a3874c7b3ad20

which is related to gart. It turned out that the gart aperture on that
box is at a different position with these patches. Before it was at
0xa400 and now it is at 0xa000. This seems to have something
to do with the root cause.

Reverting commit 1a4a678b12c84db9ae5dce424e0e97f0559bb57c fixes the
problem btw. and booting with iommu=soft also works, but I have no idea
yet why the aperture at that address is a problem (with the patch
reverted the aperture lands at 0x8000).

I have put some debug-data online. There is my .config and two
dmesg-files for good (==2.6.39-rc3 + revert) and bad (==2.6.39-rc3)
I also created these dmesg-files again with memblock=debug, maybe that
helps to find the problem. The files are at

http://www.8bytes.org/~joro/debug/

Or maybe someone else has an idea about the issue...

Joerg

[Bug 30651] [RADEON:KMS:R600G] gl output in mplayer have no colors if used with a fragment program with additional lookup and bicubic B-spline filtering

2011-04-13 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=30651

--- Comment #7 from Andy Furniss  2011-04-13 10:23:46 PDT ---
(In reply to comment #6)

> 1) yuv=4 on r600g still have no colours even though with r300g they are ok

yuv=4 with or without bicubic now works for me on 600g

> 2) still there is an overbright glitch in some white places in some videos
> with yuv=6. but again, it may be a mplayer bug since it present with r300g
> too (but not software rasterizer), i'm not sure.

This is still the same.

One general observation is that with 600g, performance is poor compared to
600c or xv, which are at least 2x faster when benchmarking with HD streams.

[Bug 30651] [RADEON:KMS:R600G] gl output in mplayer have no colors if used with a fragment program with additional lookup and bicubic B-spline filtering

2011-04-13 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=30651

--- Comment #8 from Sergey Kondakov  2011-04-13 10:46:29 PDT ---
Same here.
And I never got an answer about which method is better with an AMD/ATI card and
the open stack now. I hope the devs are looking into that stuff.

[Bug 30651] [RADEON:KMS:R600G] gl output in mplayer have no colors if used with a fragment program with additional lookup and bicubic B-spline filtering

2011-04-13 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=30651

--- Comment #9 from Andy Furniss  2011-04-13 11:38:47 PDT ---
(In reply to comment #8)
> same here.
> and i never got answer about which method is better with amd/ati card and open
> stack now. i hope devs are looking into that stuff.

Maybe there isn't an answer as such for that question.

I guess someone with an on-board low spec GPU may be more limited than a high
end card with fast vram.

Quality wise - I can't see any difference, the higher yuv= numbers give more
features like gamma correction (not sure how to use it though).

It would be nice if 600g could beat or equal 600c - it does for 3D, but for
some reason not this.

I said classic was twice as fast - it's actually more than that if I discount
time taken by the codec.

[Bug 32982] Kernel locks up a few minutes after boot

2011-04-13 Thread bugzilla-dae...@bugzilla.kernel.org

https://bugzilla.kernel.org/show_bug.cgi?id=32982





--- Comment #6 from Bart Van Assche   2011-04-13 18:49:13 ---
Although I'm still busy bisecting, I'd like to report that I got the following
hung task report with head b73a21fc66fee35b41db755abebfacba48b2fc76 (had
already seen something similar before with 2.6.39-rc2):

INFO: task kjournald:918 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kjournald   D 880131b9ddb8 0   918  2 0x
 880131b9dd20 0046 880131b9dca0 8108cd6d
 0282 880131b9dfd8 880137729f40 880131b9dfd8
 880131b9c000 880131b9c000 880131b9c000 880131b9dfd8
Call Trace:
 [] ? trace_hardirqs_on_caller+0x14d/0x190
 [] ? sub_preempt_count+0xa9/0xe0
 [] journal_commit_transaction+0x13e/0x1590 [jbd]
 [] ? _raw_spin_unlock_irqrestore+0x65/0x80
 [] ? sub_preempt_count+0xa9/0xe0
 [] ? wake_up_bit+0x40/0x40
 [] ? del_timer_sync+0x8a/0xc0
 [] ? try_to_del_timer_sync+0x110/0x110
 [] kjournald+0xf1/0x250 [jbd]
 [] ? wake_up_bit+0x40/0x40
 [] ? commit_timeout+0x10/0x10 [jbd]
 [] kthread+0x96/0xa0
 [] kernel_thread_helper+0x4/0x10
 [] ? finish_task_switch+0x7b/0xe0
 [] ? _raw_spin_unlock_irq+0x3b/0x60
 [] ? retint_restore_args+0xe/0xe
 [] ? __init_kthread_worker+0x70/0x70
 [] ? gs_change+0xb/0xb
no locks held by kjournald/918.
INFO: task klauncher:5744 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
klauncher   D 0001000297b4 0  5744   5743 0x
 88011dd73938 0046 8801 8108cbef
 813e2535 88011dd73fd8 8801382e1f40 88011dd73fd8
 88011dd72000 88011dd72000 88011dd72000 88011dd73fd8
Call Trace:
 [] ? mark_held_locks+0x6f/0xa0
 [] ? _raw_spin_unlock_irqrestore+0x65/0x80
 [] ? __wait_on_buffer+0x30/0x30
 [] io_schedule+0x59/0x80
 [] sleep_on_buffer+0xe/0x20
 [] __wait_on_bit_lock+0x5a/0xc0
 [] ? __wait_on_buffer+0x30/0x30
 [] out_of_line_wait_on_bit_lock+0x78/0x90
 [] ? autoremove_wake_function+0x50/0x50
 [] __lock_buffer+0x36/0x40
 [] do_get_write_access+0x64d/0x660 [jbd]
 [] ? sub_preempt_count+0xa9/0xe0
 [] ? start_this_handle+0x370/0x470 [jbd]
 [] ? journal_add_journal_head+0xf4/0x220 [jbd]
 [] journal_get_write_access+0x31/0x50 [jbd]
 [] __ext3_journal_get_write_access+0x2d/0x60 [ext3]
 [] ext3_reserve_inode_write+0x83/0xb0 [ext3]
 [] ext3_mark_inode_dirty+0x44/0x70 [ext3]
 [] ext3_dirty_inode+0x5e/0xa0 [ext3]
 [] __mark_inode_dirty+0x3f/0x250
 [] file_update_time+0xec/0x170
 [] ? mutex_lock_nested+0x27d/0x3a0
 [] __generic_file_aio_write+0x1f8/0x440
 [] generic_file_aio_write+0x75/0xf0
 [] do_sync_write+0xda/0x120
 [] ? remove_vma+0x77/0x90
 [] ? trace_hardirqs_on+0xd/0x10
 [] ? remove_vma+0x77/0x90
 [] vfs_write+0xc6/0x170
 [] sys_write+0x51/0x90
 [] system_call_fastpath+0x16/0x1b
2 locks held by klauncher/5744:
 #0:  (&sb->s_type->i_mutex_key#11){+.+.+.}, at: []
generic_file_aio_write+0x59/0xf0
 #1:  (jbd_handle){+.+...}, at: []
start_this_handle+0x370/0x470 [jbd]
INFO: task okular:4180 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
okular  D 00010002a251 0  4180   5743 0x
 880041d13aa8 0046 8800 8108cd6d
 0282 880041d13fd8 880037a59f40 880041d13fd8
 880041d12000 880041d12000 880041d12000 880041d13fd8
Call Trace:
 [] ? trace_hardirqs_on_caller+0x14d/0x190
 [] start_this_handle+0x244/0x470 [jbd]
 [] ? is_module_address+0x33/0x60
 [] ? wake_up_bit+0x40/0x40
 [] journal_start+0xdb/0x120 [jbd]
 [] ext3_journal_start_sb+0x36/0x70 [ext3]
 [] ext3_setattr+0x1a3/0x210 [ext3]
 [] notify_change+0x116/0x360
 [] do_truncate+0x63/0x90
 [] ? sub_preempt_count+0xa9/0xe0
 [] do_last+0x42c/0x820
 [] path_openat+0xd0/0x410
 [] ? might_fault+0x53/0xb0
 [] do_filp_open+0x7f/0xa0
 [] ? sub_preempt_count+0xa9/0xe0
 [] ? _raw_spin_unlock+0x35/0x60
 [] ? alloc_fd+0xf4/0x150
 [] do_sys_open+0x101/0x1e0
 [] sys_open+0x20/0x30
 [] system_call_fastpath+0x16/0x1b
2 locks held by okular/4180:
 #0:  (&sb->s_type->i_mutex_key#11){+.+.+.}, at: []
do_truncate+0x57/0x90
 #1:  (&sb->s_type->i_alloc_sem_key#4){+.+...}, at: []
notify_change+0x2a0/0x360

Linux 2.6.39-rc3

2011-04-13 Thread Yinghai Lu

On 04/13/2011 10:21 AM, Joerg Roedel wrote:
> On Wed, Apr 13, 2011 at 08:46:09AM +0200, Ingo Molnar wrote:
> First of all, I bisected between v2.6.37-rc2..f005fe12b90c which where
> only a couple of patches and merged v2.6.38-rc4 in at every step. There
> was no failure found.
> Then I tried this again, but this time I merged v2.6.38-rc5 at every
> step and was successful. The bad commit in this branch turned out to be
> 
>   1a4a678b12c84db9ae5dce424e0e97f0559bb57c
> 
> which is related to memblock.
> 
> Then I tried to find out which change between 2.6.38-rc4 and 2.6.38-rc5
> is needed to trigger the failure, so I used f005fe12b90c as a base,
> bisected between v2.6.38-rc4..v2.6.38-rc5 and merged every bisect step
> into the base and tested. Here the bad commit turned out to be
> 
>   e6d2e2b2b1e1455df16d68a78f4a3874c7b3ad20
> 
> which is related to gart. It turned out that the gart aperture on that
> box is on another position with these patches. Before it was as
> 0xa400 and now it is at 0xa000. It seems like this has something
> to do with the root-cause.
> 
> Reverting commit 1a4a678b12c84db9ae5dce424e0e97f0559bb57c fixes the
> problem btw. and booting with iommu=soft also works, but I have no idea
> yet why the aperture at that address is a problem (with the patch
> reverted the aperture lands at 0x8000).
> 
> I have put some debug-data online. There is my .config and two
> dmesg-files for good (==2.6.39-rc3 + revert) and bad (==2.6.39-rc3)
> I also created these dmesg-files again with memblock=debug, maybe that
> helps to find the problem. The files are at
> 
>   http://www.8bytes.org/~joro/debug/

thanks for the bisecting...

so those two patches uncover some problems.

[0.00] Checking aperture...
[0.00] No AGP bridge found
[0.00] Node 0: aperture @ a000 size 32 MB
[0.00] Aperture pointing to e820 RAM. Ignoring.
[0.00] Your BIOS doesn't leave a aperture memory hole
[0.00] Please enable the IOMMU option in the BIOS setup
[0.00] This costs you 64 MB of RAM
[0.00] memblock_x86_reserve_range: [0xa000-0xa3ff]   aperture64
[0.00] Mapping aperture over 65536 KB of RAM @ a000

So the kernel tries to reallocate the aperture, because the one the BIOS
allocated points to RAM or its size is too small.
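
That decision can be sketched roughly as follows. This is only an
illustration of the rule as described in the log messages above, not the
kernel's actual aperture code: struct ram_range, ranges_overlap() and
aperture_usable() are hypothetical names, and the minimum size is passed in
by the caller as an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one usable-RAM entry from the firmware memory map. */
struct ram_range {
    uint64_t start;   /* first byte, inclusive */
    uint64_t end;     /* one past the last byte, exclusive */
};

/* True if the half-open ranges [a_start, a_end) and [b_start, b_end) overlap. */
bool ranges_overlap(uint64_t a_start, uint64_t a_end,
                    uint64_t b_start, uint64_t b_end)
{
    return a_start < b_end && b_start < a_end;
}

/*
 * Accept a firmware-provided aperture only if it is at least min_size bytes
 * long and does not overlap any usable RAM range. If this returns false, the
 * caller would have to allocate a replacement aperture elsewhere.
 */
bool aperture_usable(uint64_t base, uint64_t size, uint64_t min_size,
                     const struct ram_range *ram, size_t nr_ram)
{
    size_t i;

    assert(ram != NULL || nr_ram == 0);
    if (size < min_size)
        return false;   /* "size is too small" */
    for (i = 0; i < nr_ram; i++)
        if (ranges_overlap(base, base + size, ram[i].start, ram[i].end))
            return false;   /* "pointing to RAM" */
    return true;
}
```

Either condition alone is enough to reject the BIOS aperture, which matches
the "Aperture pointing to e820 RAM. Ignoring." message followed by a fresh
reservation.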

but your radeon does use [0xa000, 0xbfff)

[4.281993] radeon :01:05.0: VRAM: 320M 0xC000 - 0xD3FF (320M used)
[4.290672] radeon :01:05.0: GTT: 512M 0xA000 - 0xBFFF
[4.298550] [drm] Detected VRAM RAM=320M, BAR=256M
[4.309857] [drm] RAM width 32bits DDR
[4.313748] [TTM] Zone  kernel: Available graphics memory: 1896524 kiB.
[4.320379] [TTM] Initializing pool allocator.
[4.324948] [drm] radeon: 320M of VRAM memory ready
[4.329832] [drm] radeon: 512M of GTT memory ready.

and the one that seems to work:

[0.00] Checking aperture...
[0.00] No AGP bridge found
[0.00] Node 0: aperture @ a000 size 32 MB
[0.00] Aperture pointing to e820 RAM. Ignoring.
[0.00] Your BIOS doesn't leave a aperture memory hole
[0.00] Please enable the IOMMU option in the BIOS setup
[0.00] This costs you 64 MB of RAM
[0.00] memblock_x86_reserve_range: [0x8000-0x83ff]   aperture64
[0.00] Mapping aperture over 65536 KB of RAM @ 8000
[0.00] memblock_x86_reserve_range: [0xacb6bdc0-0xacb6bddf]  BOOTMEM

It uses a different position...

[4.250159] radeon :01:05.0: VRAM: 320M 0xC000 - 0xD3FF (320M used)
[4.258830] radeon :01:05.0: GTT: 512M 0xA000 - 0xBFFF
[4.266742] [drm] Detected VRAM RAM=320M, BAR=256M
[4.271549] [drm] RAM width 32bits DDR
[4.275435] [TTM] Zone  kernel: Available graphics memory: 1896526 kiB.
[4.282066] [TTM] Initializing pool allocator.
[4.282085] usb 7-2: new full speed USB device number 2 using ohci_hcd
[4.293076] [drm] radeon: 320M of VRAM memory ready
[4.298277] [drm] radeon: 512M of GTT memory ready.
[4.303218] [drm] Supports vblank timestamp caching Rev 1 (10.10.2010).
[4.309854] [drm] Driver supports precise vblank timestamp query.
[4.315970] [drm] radeon: irq initialized.
[4.320094] [drm] GART: num cpu pages 131072, num gpu pages 131072

So the question is why radeon is using the address range [0xa000 - 0xc00]
when E820 reports it as RAM:

[0.00]  BIOS-e820: 0010 - acb8d000 (usable)
[0.00]  BIOS-e820: acb8d000 - acb8f000 (reserved)
[0.00]  BIOS-e820: acb8f000 - afce9000 (usable)
[0.00]  BIOS-e820: afce9000 - afd21000 (reserved)
[0.00]  BIOS-e820: afd21000 - afd4f000 (usable)
[0.00]  BIOS-e820: afd4f000 - afdcf000 (reserved)
[0.00]  BIOS-e820: afdcf000

Linux 2.6.39-rc3

2011-04-13 Thread H. Peter Anvin

On 04/13/2011 10:21 AM, Joerg Roedel wrote:
> 
> First of all, I bisected between v2.6.37-rc2..f005fe12b90c which where
> only a couple of patches and merged v2.6.38-rc4 in at every step. There
> was no failure found.
> Then I tried this again, but this time I merged v2.6.38-rc5 at every
> step and was successful. The bad commit in this branch turned out to be
> 
>   1a4a678b12c84db9ae5dce424e0e97f0559bb57c
> 
> which is related to memblock.
> 
> Then I tried to find out which change between 2.6.38-rc4 and 2.6.38-rc5
> is needed to trigger the failure, so I used f005fe12b90c as a base,
> bisected between v2.6.38-rc4..v2.6.38-rc5 and merged every bisect step
> into the base and tested. Here the bad commit turned out to be
> 
>   e6d2e2b2b1e1455df16d68a78f4a3874c7b3ad20
> 
> which is related to gart. It turned out that the gart aperture on that
> box is on another position with these patches. Before it was as
> 0xa400 and now it is at 0xa000. It seems like this has something
> to do with the root-cause.
> 
> Reverting commit 1a4a678b12c84db9ae5dce424e0e97f0559bb57c fixes the
> problem btw. and booting with iommu=soft also works, but I have no idea
> yet why the aperture at that address is a problem (with the patch
> reverted the aperture lands at 0x8000).
> 

Does reverting e6d2e2b2b1e1455df16d68a78f4a3874c7b3ad20 solve the
problem for you?

1a4a678b12c84db9ae5dce424e0e97f0559bb57c is a memory-allocation-order
patch, which have a nasty tendency to unmask bugs elsewhere in the
kernel.  However, e6d2e2b2b1e1455df16d68a78f4a3874c7b3ad20 looks
positively strange (and it doesn't exactly help that the description is
written in Yinghai-ese and is therefore nearly impossible to decode,
never mind tell if it is remotely correct.)

-hpa

Linux 2.6.39-rc3

2011-04-13 Thread H. Peter Anvin

On 04/13/2011 10:21 AM, Joerg Roedel wrote:
> On Wed, Apr 13, 2011 at 08:46:09AM +0200, Ingo Molnar wrote:
>> Could you please send the before/after bootlog (in particular all memory 
>> init 
>> messages included) and your .config?
>>
>>  before:  f005fe12b90c: x86-64: Move out cleanup higmap [_brk_end, _end) out 
>> of init_memory_mapping()
>>   after:  d2137d5af425: Merge branch 'linus' into x86/bootmem
>>
>> I've Cc:-ed more people who might have an idea about it.
> 
> Okay, I have done some more bisecting and debugging today.
> 

First of all, *huge* thanks for this effort.  At least we need to track
down the bits that need to be reverted -- it is past rc3, and it's time
to see what we should revert and tell the submitter to try again next cycle.

This looks to be the same issue as in bugzilla 33012:

https://bugzilla.kernel.org/show_bug.cgi?id=33012

... so it would be good if we could keep the information in there.

-hpa

Linux 2.6.39-rc3

2011-04-13 Thread Joerg Roedel

On Wed, Apr 13, 2011 at 11:51:39AM -0700, H. Peter Anvin wrote:
> On 04/13/2011 10:21 AM, Joerg Roedel wrote:
> > 
> > First of all, I bisected between v2.6.37-rc2..f005fe12b90c which where
> > only a couple of patches and merged v2.6.38-rc4 in at every step. There
> > was no failure found.
> > Then I tried this again, but this time I merged v2.6.38-rc5 at every
> > step and was successful. The bad commit in this branch turned out to be
> > 
> > 1a4a678b12c84db9ae5dce424e0e97f0559bb57c
> > 
> > which is related to memblock.
> > 
> > Then I tried to find out which change between 2.6.38-rc4 and 2.6.38-rc5
> > is needed to trigger the failure, so I used f005fe12b90c as a base,
> > bisected between v2.6.38-rc4..v2.6.38-rc5 and merged every bisect step
> > into the base and tested. Here the bad commit turned out to be
> > 
> > e6d2e2b2b1e1455df16d68a78f4a3874c7b3ad20
> > 
> > which is related to gart. It turned out that the gart aperture on that
> > box is on another position with these patches. Before it was as
> > 0xa400 and now it is at 0xa000. It seems like this has something
> > to do with the root-cause.
> > 
> > Reverting commit 1a4a678b12c84db9ae5dce424e0e97f0559bb57c fixes the
> > problem btw. and booting with iommu=soft also works, but I have no idea
> > yet why the aperture at that address is a problem (with the patch
> > reverted the aperture lands at 0x8000).
> > 
> 
> Does reverting e6d2e2b2b1e1455df16d68a78f4a3874c7b3ad20 solve the
> problem for you?

No, reverting that patch doesn't make the problem go away (and the gart
aperture is still at 0xa000). I tested this in 39-rc3; I haven't
tested whether it makes a difference on the original bisect commit from Ingo.
It probably does (I don't know if that matters).
What is strange about this commit is that it fixes an x86 gart aperture
allocation bug in generic memblock code.

> 1a4a678b12c84db9ae5dce424e0e97f0559bb57c is a memory-allocation-order
> patch, which have a nasty tendency to unmask bugs elsewhere in the
> kernel.  However, e6d2e2b2b1e1455df16d68a78f4a3874c7b3ad20 looks
> positively strange (and it doesn't exactly help that the description is
> written in Yinghai-ese and is therefore nearly impossible to decode,
> never mind tell if it is remotely correct.)

I think that the two commits are okay and the bug is somewhere else, but
I have no idea yet where to look next. I spent some time looking at
radeon code and talking to Alex about it (because it seemed suspicious
that the GTT is at 0xa000 too, but as Alex explained to me, this is an
address in the GPU address space and shouldn't matter).

Regards,

   Joerg

Linux 2.6.39-rc3

2011-04-13 Thread Joerg Roedel

On Wed, Apr 13, 2011 at 11:39:29AM -0700, H. Peter Anvin wrote:
> On 04/13/2011 10:21 AM, Joerg Roedel wrote:
> > On Wed, Apr 13, 2011 at 08:46:09AM +0200, Ingo Molnar wrote:
> >> Could you please send the before/after bootlog (in particular all memory 
> >> init 
> >> messages included) and your .config?
> >>
> >>  before:  f005fe12b90c: x86-64: Move out cleanup higmap [_brk_end, _end) 
> >> out of init_memory_mapping()
> >>   after:  d2137d5af425: Merge branch 'linus' into x86/bootmem
> >>
> >> I've Cc:-ed more people who might have an idea about it.
> > 
> > Okay, I have done some more bisecting and debugging today.
> > 
> 
> First of all, *huge* thanks for this effort.  At least we need to track
> down the bits that need to be reverted -- it is past rc3, and it's time
> to see what we should revert and tell the submitter to try again next cycle.
> 
> This looks to be the same issue as in bugzilla 33012:
> 
>   https://bugzilla.kernel.org/show_bug.cgi?id=33012
> 
> ... so it would be good if we could keep the information in there.

Yes, I will try to find my korg bugzilla account again and put the
information from this email there.

Joerg

Linux 2.6.39-rc3

2011-04-13 Thread Joerg Roedel

On Wed, Apr 13, 2011 at 12:14:55PM -0700, Yinghai Lu wrote:
> thanks for the bisecting...
> 
> so those two patches uncover some problems.
> 
> [0.00] Checking aperture...
> [0.00] No AGP bridge found
> [0.00] Node 0: aperture @ a000 size 32 MB
> [0.00] Aperture pointing to e820 RAM. Ignoring.
> [0.00] Your BIOS doesn't leave a aperture memory hole
> [0.00] Please enable the IOMMU option in the BIOS setup
> [0.00] This costs you 64 MB of RAM
> [0.00] memblock_x86_reserve_range: [0xa000-0xa3ff]   
> aperture64
> [0.00] Mapping aperture over 65536 KB of RAM @ a000
> 
> so kernel try to reallocate apperture. because BIOS allocated is pointed to 
> RAM or size is too small.

It is actually beyond 4GB on that machine; the value read here is from
the previous kernel boot. The BIOS does not reset these values on a
reboot.

> but your radeon does use [0xa000, 0xbfff)

Yes, I suspected that too (and spent a few hours reading radeon code),
but then I talked to Alex Deucher and he explained that these addresses
which the driver prints for GTT and VRAM are in the GPU address space
and do not refer to system RAM. So this shouldn't be the problem.
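
To illustrate why a GPU-side address need not match a system address, here
is a minimal sketch of a GART-style lookup, assuming 4 KiB pages. The
struct and function names (sketch_gart, sketch_gart_translate) and the
single-level table are hypothetical simplifications, not the radeon
driver's real data structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u   /* assumed page size for this sketch */

/* Hypothetical GART model: a GPU-side window backed by scattered system pages. */
struct sketch_gart {
    uint64_t gpu_start;          /* start of the window in GPU address space */
    size_t nr_pages;             /* number of pages the window covers */
    const uint64_t *sys_pages;   /* system address of each backing page */
};

/* Translate a GPU address inside the window to the backing system address. */
uint64_t sketch_gart_translate(const struct sketch_gart *gart, uint64_t gpu_addr)
{
    uint64_t offset;

    assert(gart != NULL);
    assert(gpu_addr >= gart->gpu_start);
    offset = gpu_addr - gart->gpu_start;
    assert(offset / SKETCH_PAGE_SIZE < gart->nr_pages);
    /* Page index selects the backing page; the low bits pass through. */
    return gart->sys_pages[offset / SKETCH_PAGE_SIZE] + offset % SKETCH_PAGE_SIZE;
}
```

In this model the window's GPU start address is only a base for the table
index, so it can numerically coincide with a system RAM address without the
two referring to the same memory. The page count is also consistent with
the driver log quoted in this thread: 512 MiB / 4 KiB = 131072 pages.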

Joerg

Linux 2.6.39-rc3

2011-04-13 Thread Alex Deucher

On Wed, Apr 13, 2011 at 3:14 PM, Yinghai Lu  wrote:
> On 04/13/2011 10:21 AM, Joerg Roedel wrote:
>> On Wed, Apr 13, 2011 at 08:46:09AM +0200, Ingo Molnar wrote:
>> First of all, I bisected between v2.6.37-rc2..f005fe12b90c which where
>> only a couple of patches and merged v2.6.38-rc4 in at every step. There
>> was no failure found.
>> Then I tried this again, but this time I merged v2.6.38-rc5 at every
>> step and was successful. The bad commit in this branch turned out to be
>>
>>       1a4a678b12c84db9ae5dce424e0e97f0559bb57c
>>
>> which is related to memblock.
>>
>> Then I tried to find out which change between 2.6.38-rc4 and 2.6.38-rc5
>> is needed to trigger the failure, so I used f005fe12b90c as a base,
>> bisected between v2.6.38-rc4..v2.6.38-rc5 and merged every bisect step
>> into the base and tested. Here the bad commit turned out to be
>>
>>       e6d2e2b2b1e1455df16d68a78f4a3874c7b3ad20
>>
>> which is related to gart. It turned out that the gart aperture on that
>> box is on another position with these patches. Before it was as
>> 0xa400 and now it is at 0xa000. It seems like this has something
>> to do with the root-cause.
>>
>> Reverting commit 1a4a678b12c84db9ae5dce424e0e97f0559bb57c fixes the
>> problem btw. and booting with iommu=soft also works, but I have no idea
>> yet why the aperture at that address is a problem (with the patch
>> reverted the aperture lands at 0x8000).
>>
>> I have put some debug-data online. There is my .config and two
>> dmesg-files for good (==2.6.39-rc3 + revert) and bad (==2.6.39-rc3)
>> I also created these dmesg-files again with memblock=debug, maybe that
>> helps to find the problem. The files are at
>>
>>       http://www.8bytes.org/~joro/debug/
>
> thanks for the bisecting...
>
> so those two patches uncover some problems.
>
> [    0.00] Checking aperture...
> [    0.00] No AGP bridge found
> [    0.00] Node 0: aperture @ a000 size 32 MB
> [    0.00] Aperture pointing to e820 RAM. Ignoring.
> [    0.00] Your BIOS doesn't leave a aperture memory hole
> [    0.00] Please enable the IOMMU option in the BIOS setup
> [    0.00] This costs you 64 MB of RAM
> [    0.00]     memblock_x86_reserve_range: [0xa000-0xa3ff]       aperture64
> [    0.00] Mapping aperture over 65536 KB of RAM @ a000
>
> so kernel try to reallocate apperture. because BIOS allocated is pointed to 
> RAM or size is too small.
>
> but your radeon does use [0xa000, 0xbfff)
>
> [    4.281993] radeon :01:05.0: VRAM: 320M 0xC000 - 0xD3FF (320M used)
> [    4.290672] radeon :01:05.0: GTT: 512M 0xA000 - 0xBFFF
> [    4.298550] [drm] Detected VRAM RAM=320M, BAR=256M
> [    4.309857] [drm] RAM width 32bits DDR
> [    4.313748] [TTM] Zone  kernel: Available graphics memory: 1896524 kiB.
> [    4.320379] [TTM] Initializing pool allocator.
> [    4.324948] [drm] radeon: 320M of VRAM memory ready
> [    4.329832] [drm] radeon: 512M of GTT memory ready.
>
> and the one seems working:
>
> [    0.00] Checking aperture...
> [    0.00] No AGP bridge found
> [    0.00] Node 0: aperture @ a000 size 32 MB
> [    0.00] Aperture pointing to e820 RAM. Ignoring.
> [    0.00] Your BIOS doesn't leave a aperture memory hole
> [    0.00] Please enable the IOMMU option in the BIOS setup
> [    0.00] This costs you 64 MB of RAM
> [    0.00]     memblock_x86_reserve_range: [0x8000-0x83ff]       aperture64
> [    0.00] Mapping aperture over 65536 KB of RAM @ 8000
> [    0.00]     memblock_x86_reserve_range: [0xacb6bdc0-0xacb6bddf]          BOOTMEM
>
> it will use a different position...
>
> [    4.250159] radeon :01:05.0: VRAM: 320M 0xC000 - 0xD3FF (320M used)
> [    4.258830] radeon :01:05.0: GTT: 512M 0xA000 - 0xBFFF
> [    4.266742] [drm] Detected VRAM RAM=320M, BAR=256M
> [    4.271549] [drm] RAM width 32bits DDR
> [    4.275435] [TTM] Zone  kernel: Available graphics memory: 1896526 kiB.
> [    4.282066] [TTM] Initializing pool allocator.
> [    4.282085] usb 7-2: new full speed USB device number 2 using ohci_hcd
> [    4.293076] [drm] radeon: 320M of VRAM memory ready
> [    4.298277] [drm] radeon: 512M of GTT memory ready.
> [    4.303218] [drm] Supports vblank timestamp caching Rev 1 (10.10.2010).
> [    4.309854] [drm] Driver supports precise vblank timestamp query.
> [    4.315970] [drm] radeon: irq initialized.
> [    4.320094] [drm] GART: num cpu pages 131072, num gpu pages 131072
>
> So the question is why radeon is using the address range [0xa000 - 0xc00],
> when in E820 it is RAM.
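
The comparison made in the quoted analysis (a relocated aperture versus the range the radeon reports for GTT) can be sketched as a half-open interval overlap test. This is only an illustration: the base addresses below are hypothetical, since the addresses in the archived log are truncated, and the helper name `ranges_overlap` is made up for this sketch. Such a test is only meaningful when both ranges live in the same address space, which is exactly the point the reply below takes up.

```python
MB = 1024 * 1024


def ranges_overlap(start_a, size_a, start_b, size_b):
    """Return True if [start_a, start_a+size_a) intersects [start_b, start_b+size_b)."""
    end_a = start_a + size_a
    end_b = start_b + size_b
    # Two half-open ranges intersect when each one starts before the other ends.
    return start_a < end_b and start_b < end_a


# Hypothetical values for illustration only (NOT read from the log above).
APERTURE_SIZE = 64 * MB           # "Mapping aperture over 65536 KB of RAM"
GTT_SIZE = 512 * MB               # "GTT: 512M"
GTT_BASE = 0xA0000000             # assumed base of the range reported for GTT
BAD_APERTURE_BASE = 0xA0000000    # assumed placement in the failing boot
GOOD_APERTURE_BASE = 0x80000000   # assumed placement in the working boot

if __name__ == "__main__":
    for label, base in (("bad", BAD_APERTURE_BASE), ("good", GOOD_APERTURE_BASE)):
        hit = ranges_overlap(base, APERTURE_SIZE, GTT_BASE, GTT_SIZE)
        print(f"{label}: aperture @ {base:#x} overlaps GTT range: {hit}")
```

With these assumed numbers the first placement intersects the 512M range and the second does not, which mirrors the difference between the two boot logs.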

The VRAM and GTT addresses in the dmesg are internal GPU addresses, not
system addresses.  The GPU has its own internal address space for
on-chip memory clients (texture samplers, render buffers, display
controllers, etc.).  The GPU sets up two apertures in its internal
addres

Revert 737a3bb9416ce2a7c7a4170852473a4fcc9c67e8 ?

2011-04-13 Thread Andy Furniss

Michel Dänzer wrote:

>>> That does sound like the GPU locks up. Do you get any messages in dmesg
>>> about lockups and attempts to reset the GPU at any time?
>>
>> No.
>
> Hmm, I guess the constant SIGALRMs might prevent the lockup detection
> from kicking in... Maybe you can try starting the X server with
> -dumbSched to see if that gets things along any further, but in the end
> there's probably no way around figuring out what causes the lockup and
> fixing that anyway.

I have an old AGP box that locks up with 600g + agpgart. It used to report
a GPU lockup in dmesg/log, but (I only test it occasionally) it doesn't
anymore. I can still sysrq OK.

I wonder if something changed in recent months in the drm/whatever code 
that has changed/blocked the logging.
