Re: [PATCH] drm/i915: restore only the mode of this driver on lastclose (v2)
On Wed, 13 Apr 2011 09:35:55 +1000, Dave Airlie wrote:
> From: Dave Airlie
>
> i915 calls the panic handler function on last close to reset the modes,
> however this is a really bad idea for multi-gpu machines, esp shareable
> gpus machines. So add a new entry point for the driver to just restore
> its own fbcon mode.
>
> v2: move code into fb helper, fix panic code to block mode change on
> powered off GPUs.

2 bugs in one patch? This could be split into 3 steps... ;-)

Aside from that, looks good.
-Chris

--
Chris Wilson, Intel Open Source Technology Centre
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: Linux 2.6.39-rc3
* Joerg Roedel wrote:
> > > The problem does not happen with 2.6.38. I try to bisect this further
> > > down to a commit. Alex, please let me know if you need any further
> > > information.
> >
> > If you can bisect it, that would be great. Thanks,
>
> Bisecting actually gave a very weird result. It points to
>
> d2137d5af4259f50c19addb8246a186c9ffac325
>
> which is a merge-commit in the x86 tree. Even more weird is that this
> notebook is the only machine with these symptoms, all my other boxes are
> fine.
>
> During the bisect I tested commits from Yinghai which were good. It seems
> like the problem appeared with the merge.

There's a similar looking bug being debugged here:

https://bugzilla.kernel.org/show_bug.cgi?id=33012

Could you please send the before/after bootlog (in particular all memory init
messages included) and your .config?

before: f005fe12b90c: x86-64: Move out cleanup higmap [_brk_end, _end) out of init_memory_mapping()
after: d2137d5af425: Merge branch 'linus' into x86/bootmem

I've Cc:-ed more people who might have an idea about it.

Thanks,

Ingo
[Bug 34534] resolution 3840x1024 stopped to work on HD5850 after switch to 2.6.37 kernel
https://bugs.freedesktop.org/show_bug.cgi?id=34534

--- Comment #15 from Peter Hercek 2011-04-13 00:37:56 PDT ---
Created an attachment (id=45562)
 --> (https://bugs.freedesktop.org/attachment.cgi?id=45562)
xrandr --verbose output on 2.6.38.2-vanilla (with 3840x1024 fixed using
radeonreg regset 0x770c 0x00020004)

--
Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are the assignee for the bug.
Re: Revert 737a3bb9416ce2a7c7a4170852473a4fcc9c67e8 ?
On Tue, Apr 12, 2011 at 07:29:22PM +0200, Michel Dänzer wrote:
> On Tue, 2011-04-12 at 14:00 +0200, Gabriel Paubert wrote:
> > On Tue, Apr 12, 2011 at 01:46:10PM +0200, Michel Dänzer wrote:
> > > >
> > > > With no_wb=1 the driver goes a bit further but the X server ends
> > > > up in an infinite ioctl loop and the logs are:
> > >
> > > Which ioctl does it loop on? Please provide the Xorg.0.log file as well.
> >
> > From memory, the code was 0x64, which is DRM_RADEON_GEM_WAIT_IDLE.
>
> Note that it's normal for this ioctl to be called every time before the
> GPU accessible pixmap memory is accessed by the CPU. Unless the ioctl
> always returns an error, this may not indicate a problem on its own.

It seems to be an infinite loop, always returning EINTR because
of regular SIGALRM delivery.

> > The Xorg.0.log from the previous boot is attached.
>
> I don't see any obvious problems in it. Can you describe the symptoms of
> the problem you're having with X a bit more?

Well, X is dead, or rather in an infinite ioctl loop as described above.
IIRC, the display enters a power-down mode and there is nothing to see.

> One thing I notice is that the X server/driver are rather oldish. Maybe
> you can try newer versions from testing, sid or even experimental to see
> if that makes any difference.

I lack time to do it until early May (being away for 2 weeks starting
on Friday and busy on urgent things). I'm indeed on Debian stable (Squeeze),
which is rather recent and the machine is about 2 1/2 years old.

Gabriel
Re: Revert 737a3bb9416ce2a7c7a4170852473a4fcc9c67e8 ?
On Tue, Apr 12, 2011 at 01:46:10PM +0200, Michel Dänzer wrote:
> BTW, if your kernel contains commit
> 69a07f0b117a40fcc1a479358d8e1f41793617f2, can you try if reverting that
> helps?

My kernel is pristine 2.6.38 and does not include this commit
(was introduced before 2.6.39-rc1 according to gitk).

Gabriel
[Bug 34534] resolution 3840x1024 stopped to work on HD5850 after switch to 2.6.37 kernel
https://bugs.freedesktop.org/show_bug.cgi?id=34534

--- Comment #16 from Peter Hercek 2011-04-13 01:37:03 PDT ---
(In reply to comment #14)
> Does this patch help?

No, the image stays corrupted, I still need to do this to fix it:

# radeonreg regset 0x770c 0x00020004
OLD: 0x770c (770c) 0x00010005 (65541)
NEW: 0x770c (770c) 0x00010004 (65540)
#

I applied and tested the patch with 2.6.38.2-vanilla.
Re: Revert 737a3bb9416ce2a7c7a4170852473a4fcc9c67e8 ?
On Wed, 2011-04-13 at 09:59 +0200, Gabriel Paubert wrote:
>
> Well, X is dead, or rather in an infinite ioctl loop as described
> above.
> IIRC, the display enters a power-down mode and there is nothing to
> see.

So basically the card crashed. There's about an infinite amount of
reasons why radeons do so, sometimes it has to do with them not liking
what you ate that day...

The only thing I can see that could be of use would be a bisect

Cheers,
Ben.
Re: Revert 737a3bb9416ce2a7c7a4170852473a4fcc9c67e8 ?
On Wed, Apr 13, 2011 at 06:16:13PM +1000, Benjamin Herrenschmidt wrote:
> On Wed, 2011-04-13 at 09:59 +0200, Gabriel Paubert wrote:
> >
> > Well, X is dead, or rather in an infinite ioctl loop as described
> > above.
> > IIRC, the display enters a power-down mode and there is nothing to
> > see.
>
> So basically the card crashed. There's about an infinite amount of
> reasons why radeons do so, sometimes it has to do with them not liking
> what you ate that day...
>
> The only thing I can see that could be of use would be a bisect

Bisecting for something which I have never got to work (radeon with KMS) on
this machine is something I don't know how to do...

Note that radeon without KMS also always ends up crashing, but it may take
hours. The only case where the machine works reliably is when glxinfo claims
that it is using software rendering.

Regards,
Gabriel
Re: small git lesson [Was: Re: Revert 737a3bb9416ce2a7c7a4170852473a4fcc9c67e8 ?]
On Wed, Apr 13, 2011 at 10:59:14AM +0200, Andreas Schwab wrote:
> Uwe Kleine-König writes:
>
> > $ git name-rev --refs=refs/tags/v2.6\* 69a07f0b117a40fcc1a479358d8e1f41793617f2
> > 69a07f0b117a40fcc1a479358d8e1f41793617f2 tags/v2.6.39-rc2~3^2~43^2~4
> >
> > so it was introduced just before -rc2.
>
> $ git tag --contains 69a07f0b117a40fcc1a479358d8e1f41793617f2
> v2.6.39-rc1
> v2.6.39-rc2
>

So who is right? I think it was before rc1.

Anyway I'm aware that there are other git commands, although for the
option details I often have to have a look at the man page.

However in this case the main reason to fire up gitk was to have a quick look
at the patch and its context, and I simply reported the "Precedes" line
in the display, which is 2.6.39-rc1. It also follows v2.6.37-rc2, which means
that it has been quite a long time outside the main tree.

Gabriel
[Bug 35502] Regression: black screen with Radeon KMS in 2.6.38 (2.6.37.4 worked fine)
https://bugs.freedesktop.org/show_bug.cgi?id=35502

Michel Dänzer changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |br...@canonical.com

--- Comment #9 from Michel Dänzer 2011-04-13 04:45:42 PDT ---
*** Bug 36007 has been marked as a duplicate of this bug. ***
Re: [PATCH] Big Endian support for RV730 (Mesa r600)
On Tue, 2011-04-12 at 10:01 +0200, Cédric Cano wrote:
> Hi
>
> Here you are a patch that adds big endian support for rv730 in r600
> classic mesa driver. The BE modifications are almost the same as the DRM
> / DDX driver modifications
> (http://lists.freedesktop.org/archives/dri-devel/2011-February/008151.html).
>
> I used the mesa-demos to test the driver status on big endian platform.
> Nearly all demos renders the same as on Intel architecture.
> Nevertheless, there are still some issues in glReadPixels (r600_blit)
> with some formats. I can't figure out exactly what and when data must be
> swapped (set_tex_resoures, set_render_target...). Review of the patch
> would be greatly appreciated.
>
> It seems that r600g will be the default for Mesa 7.11 so I'll try to
> enable big endian support for Gallium now.

Cool stuff !

I'll try to test that one of these days on various ppc's

Cheers,
Ben.
Re: [PATCH] Big Endian support for RV730 (Mesa r600)
On Wed, 2011-04-13 at 22:05 +1000, Benjamin Herrenschmidt wrote:
> On Tue, 2011-04-12 at 10:01 +0200, Cédric Cano wrote:
> > Hi
> >
> > Here you are a patch that adds big endian support for rv730 in r600
> > classic mesa driver. The BE modifications are almost the same as the DRM
> > / DDX driver modifications
> > (http://lists.freedesktop.org/archives/dri-devel/2011-February/008151.html).
> >
> > I used the mesa-demos to test the driver status on big endian platform.
> > Nearly all demos renders the same as on Intel architecture.
> > Nevertheless, there are still some issues in glReadPixels (r600_blit)
> > with some formats. I can't figure out exactly what and when data must be
> > swapped (set_tex_resoures, set_render_target...). Review of the patch
> > would be greatly appreciated.
> >
> > It seems that r600g will be the default for Mesa 7.11 so I'll try to
> > enable big endian support for Gallium now.
>
> Cool stuff !
>
> I'll try to test that one of these days on various ppc's

BTW. I see you used some FSL embedded board. Do you have your PCIe MMIO
space above 32-bit ?

Last I looked, there was a bunch of fixing needing to be done, among
others in the TTM, to make that work.

I had some preliminary patches but they bitrot... mostly the issue is to
make sure that a phys_addr_t is used instead of an unsigned long
whenever it tries to store the physical address of an object.

Ben.
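The truncation Ben refers to can be sketched outside the kernel. This is only an illustration under stated assumptions: `ulong32_t` and `phys_addr64_t` are hypothetical stand-ins for `unsigned long` and `phys_addr_t` on a 32-bit CPU whose physical address space is wider than 32 bits, and the helper names are made up for this sketch.

```c
#include <stdint.h>

/* Hypothetical stand-ins: on the assumed 32-bit CPU, unsigned long is
 * 32 bits wide while phys_addr_t is 64 bits wide. */
typedef uint32_t ulong32_t;
typedef uint64_t phys_addr64_t;

/* Storing a physical address in the narrow type silently drops the
 * upper 32 bits. */
ulong32_t store_as_ulong(phys_addr64_t addr)
{
    return (ulong32_t)addr;
}

/* Storing it in the wide type keeps the full address. */
phys_addr64_t store_as_phys_addr(phys_addr64_t addr)
{
    return addr;
}

/* Returns 1 if addr survives a round trip through the narrow type. */
int fits_in_ulong(phys_addr64_t addr)
{
    return (phys_addr64_t)store_as_ulong(addr) == addr;
}
```

An MMIO window that starts above 4 GiB fails the `fits_in_ulong()` check, which is the case where every place that keeps the address in an `unsigned long` would need to change.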
Re: Revert 737a3bb9416ce2a7c7a4170852473a4fcc9c67e8 ?
On Wed, 2011-04-13 at 09:59 +0200, Gabriel Paubert wrote:
> On Tue, Apr 12, 2011 at 07:29:22PM +0200, Michel Dänzer wrote:
> > On Tue, 2011-04-12 at 14:00 +0200, Gabriel Paubert wrote:
> > > On Tue, Apr 12, 2011 at 01:46:10PM +0200, Michel Dänzer wrote:
> > > > >
> > > > > With no_wb=1 the driver goes a bit further but the X server ends
> > > > > up in an infinite ioctl loop and the logs are:
> > > >
> > > > Which ioctl does it loop on? Please provide the Xorg.0.log file as well.
> > >
> > > From memory, the code was 0x64, which is DRM_RADEON_GEM_WAIT_IDLE.
> >
> > Note that it's normal for this ioctl to be called every time before the
> > GPU accessible pixmap memory is accessed by the CPU. Unless the ioctl
> > always returns an error, this may not indicate a problem on its own.
>
> It seems to be an infinite loop, always returning EINTR because
> of regular SIGALRM delivery.

That does sound like the GPU locks up. Do you get any messages in dmesg
about lockups and attempts to reset the GPU at any time?

--
Earthling Michel Dänzer           |                http://www.vmware.com
Libre software enthusiast         |          Debian, X and DRI developer
Re: Revert 737a3bb9416ce2a7c7a4170852473a4fcc9c67e8 ?
On Wed, Apr 13, 2011 at 02:12:16PM +0200, Michel Dänzer wrote:
> On Wed, 2011-04-13 at 09:59 +0200, Gabriel Paubert wrote:
> > On Tue, Apr 12, 2011 at 07:29:22PM +0200, Michel Dänzer wrote:
> > > On Tue, 2011-04-12 at 14:00 +0200, Gabriel Paubert wrote:
> > > > On Tue, Apr 12, 2011 at 01:46:10PM +0200, Michel Dänzer wrote:
> > > > > >
> > > > > > With no_wb=1 the driver goes a bit further but the X server ends
> > > > > > up in an infinite ioctl loop and the logs are:
> > > > >
> > > > > Which ioctl does it loop on? Please provide the Xorg.0.log file as
> > > > > well.
> > > >
> > > > From memory, the code was 0x64, which is DRM_RADEON_GEM_WAIT_IDLE.
> > >
> > > Note that it's normal for this ioctl to be called every time before the
> > > GPU accessible pixmap memory is accessed by the CPU. Unless the ioctl
> > > always returns an error, this may not indicate a problem on its own.
> >
> > It seems to be an infinite loop, always returning EINTR because
> > of regular SIGALRM delivery.
>
> That does sound like the GPU locks up. Do you get any messages in dmesg
> about lockups and attempts to reset the GPU at any time?

No.

Gabriel
Re: Revert 737a3bb9416ce2a7c7a4170852473a4fcc9c67e8 ?
On Wed, 2011-04-13 at 14:27 +0200, Gabriel Paubert wrote:
> On Wed, Apr 13, 2011 at 02:12:16PM +0200, Michel Dänzer wrote:
> > On Wed, 2011-04-13 at 09:59 +0200, Gabriel Paubert wrote:
> > > On Tue, Apr 12, 2011 at 07:29:22PM +0200, Michel Dänzer wrote:
> > > > On Tue, 2011-04-12 at 14:00 +0200, Gabriel Paubert wrote:
> > > > > On Tue, Apr 12, 2011 at 01:46:10PM +0200, Michel Dänzer wrote:
> > > > > > >
> > > > > > > With no_wb=1 the driver goes a bit further but the X server ends
> > > > > > > up in an infinite ioctl loop and the logs are:
> > > > > >
> > > > > > Which ioctl does it loop on? Please provide the Xorg.0.log file as
> > > > > > well.
> > > > >
> > > > > From memory, the code was 0x64, which is DRM_RADEON_GEM_WAIT_IDLE.
> > > >
> > > > Note that it's normal for this ioctl to be called every time before the
> > > > GPU accessible pixmap memory is accessed by the CPU. Unless the ioctl
> > > > always returns an error, this may not indicate a problem on its own.
> > >
> > > It seems to be an infinite loop, always returning EINTR because
> > > of regular SIGALRM delivery.
> >
> > That does sound like the GPU locks up. Do you get any messages in dmesg
> > about lockups and attempts to reset the GPU at any time?
>
> No.

Hmm, I guess the constant SIGALRMs might prevent the lockup detection
from kicking in... Maybe you can try starting the X server with
-dumbSched to see if that gets things along any further, but in the end
there's probably no way around figuring out what causes the lockup and
fixing that anyway.

--
Earthling Michel Dänzer           |                http://www.vmware.com
Libre software enthusiast         |          Debian, X and DRI developer
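For readers unfamiliar with the pattern under discussion: a caller usually re-issues a system call that fails with EINTR, so a wait that is interrupted by every SIGALRM never completes from the caller's point of view. A minimal sketch follows, assuming hypothetical mock functions in place of the real wait-idle ioctl and a retry limit added purely so the stuck case becomes observable. It does not reproduce the X server or driver code.

```c
#include <errno.h>

/* Hypothetical callback type standing in for the wait-idle ioctl. */
typedef int (*wait_fn)(void *ctx);

/* Retry the call while it fails with EINTR.
 * Returns the number of attempts used (>= 1) on success, 0 if every
 * attempt was interrupted, and -1 on any other error. */
int wait_retry_eintr(wait_fn fn, void *ctx, int max_tries)
{
    for (int attempt = 1; attempt <= max_tries; attempt++) {
        if (fn(ctx) == 0)
            return attempt;
        if (errno != EINTR)
            return -1;
    }
    return 0;
}

/* Mock of a stuck wait: a signal always arrives before it can finish. */
int always_eintr(void *ctx)
{
    (void)ctx;
    errno = EINTR;
    return -1;
}

/* Mock of a healthy wait: interrupted *remaining times, then succeeds. */
int eintr_then_ok(void *ctx)
{
    int *remaining = ctx;

    if (*remaining > 0) {
        (*remaining)--;
        errno = EINTR;
        return -1;
    }
    return 0;
}
```

With `eintr_then_ok` the wrapper returns once the interruptions stop; with `always_eintr` it exhausts its budget, which corresponds to the endless loop Gabriel reports when there is no retry limit.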
Re: [PATCH] drm/radeon/kms: fix suspend on rv530 asics
On Tue, Apr 12, 2011 at 1:33 PM, Alex Deucher wrote:
> Apparently only rv515 asics need the workaround
> added in f24d86f1a49505cdea56728b853a5d0a3f8e3d11
> (drm/radeon/kms: fix resume regression for some r5xx laptops).
>
> Fixes:
> https://bugs.freedesktop.org/show_bug.cgi?id=34709
>
> Signed-off-by: Alex Deucher
> Cc: sta...@kernel.org
> ---
>  drivers/gpu/drm/radeon/atom.c |    6 +-
>  1 files changed, 5 insertions(+), 1 deletions(-)
>
> diff --git a/drivers/gpu/drm/radeon/atom.c b/drivers/gpu/drm/radeon/atom.c
> index 258fa5e..d71d375 100644
> --- a/drivers/gpu/drm/radeon/atom.c
> +++ b/drivers/gpu/drm/radeon/atom.c
> @@ -32,6 +32,7 @@
>  #include "atom.h"
>  #include "atom-names.h"
>  #include "atom-bits.h"
> +#include "radeon.h"
>
>  #define ATOM_COND_ABOVE          0
>  #define ATOM_COND_ABOVEOREQUAL   1
> @@ -101,7 +102,9 @@ static void debug_print_spaces(int n)
>  static uint32_t atom_iio_execute(struct atom_context *ctx, int base,
>                                   uint32_t index, uint32_t data)
>  {
> +       struct radeon_device *rdev = ctx->card->dev->dev_private;
>         uint32_t temp = 0xCDCDCDCD;
> +
>         while (1)
>                 switch (CU8(base)) {
>                 case ATOM_IIO_NOP:
> @@ -112,7 +115,8 @@ static uint32_t atom_iio_execute(struct atom_context *ctx, int base,
>                         base += 3;
>                         break;
>                 case ATOM_IIO_WRITE:
> -                       (void)ctx->card->ioreg_read(ctx->card, CU16(base + 1));
> +                       if (rdev->family == CHIP_RV515)
> +                               (void)ctx->card->ioreg_read(ctx->card, CU16(base + 1));
>                         ctx->card->ioreg_write(ctx->card, CU16(base + 1), temp);
>                         base += 3;
>                         break;
> --
> 1.7.1.1
>

So this patch enables io write only for one family? This looks utterly
strange.

Cheers,
Jerome
Re: [PATCH] drm/radeon/kms: fix suspend on rv530 asics
On Wed, Apr 13, 2011 at 10:46 AM, Jerome Glisse wrote:
> On Tue, Apr 12, 2011 at 1:33 PM, Alex Deucher wrote:
>> Apparently only rv515 asics need the workaround
>> added in f24d86f1a49505cdea56728b853a5d0a3f8e3d11
>> (drm/radeon/kms: fix resume regression for some r5xx laptops).
>>
>> Fixes:
>> https://bugs.freedesktop.org/show_bug.cgi?id=34709
>>
>> Signed-off-by: Alex Deucher
>> Cc: sta...@kernel.org
>> ---
>>  drivers/gpu/drm/radeon/atom.c |    6 +-
>>  1 files changed, 5 insertions(+), 1 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/radeon/atom.c b/drivers/gpu/drm/radeon/atom.c
>> index 258fa5e..d71d375 100644
>> --- a/drivers/gpu/drm/radeon/atom.c
>> +++ b/drivers/gpu/drm/radeon/atom.c
>> @@ -32,6 +32,7 @@
>>  #include "atom.h"
>>  #include "atom-names.h"
>>  #include "atom-bits.h"
>> +#include "radeon.h"
>>
>>  #define ATOM_COND_ABOVE          0
>>  #define ATOM_COND_ABOVEOREQUAL   1
>> @@ -101,7 +102,9 @@ static void debug_print_spaces(int n)
>>  static uint32_t atom_iio_execute(struct atom_context *ctx, int base,
>>                                   uint32_t index, uint32_t data)
>>  {
>> +       struct radeon_device *rdev = ctx->card->dev->dev_private;
>>         uint32_t temp = 0xCDCDCDCD;
>> +
>>         while (1)
>>                 switch (CU8(base)) {
>>                 case ATOM_IIO_NOP:
>> @@ -112,7 +115,8 @@ static uint32_t atom_iio_execute(struct atom_context *ctx, int base,
>>                         base += 3;
>>                         break;
>>                 case ATOM_IIO_WRITE:
>> -                       (void)ctx->card->ioreg_read(ctx->card, CU16(base + 1));
>> +                       if (rdev->family == CHIP_RV515)
>> +                               (void)ctx->card->ioreg_read(ctx->card, CU16(base + 1));
>>                         ctx->card->ioreg_write(ctx->card, CU16(base + 1), temp);
>>                         base += 3;
>>                         break;
>> --
>> 1.7.1.1
>>
>
> So this patch enables io write only for one family? This looks utterly
> strange.

No, it just does a read before write for rv515. I don't know why it
needs it, but it seems to.

Alex

>
> Cheers,
> Jerome
>
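The behaviour Alex describes (the write always happens, only the dummy read in front of it is limited to rv515) can be checked with a small stand-alone model. This is a sketch, not driver code: `struct mock_card`, the `mock_ioreg_*` helpers, `iio_write()` and the 16-register bank are hypothetical, and `CHIP_RV530` is used here only as an example of a non-rv515 family.

```c
#include <assert.h>
#include <stdint.h>

/* Only CHIP_RV515 is named in the patch; CHIP_RV530 is an example of
 * "any other family" for this sketch. */
enum chip_family { CHIP_RV515, CHIP_RV530 };

/* Hypothetical register bank that counts accesses. */
struct mock_card {
    enum chip_family family;
    uint32_t regs[16];
    unsigned reads;
    unsigned writes;
};

uint32_t mock_ioreg_read(struct mock_card *card, uint16_t reg)
{
    assert(reg < 16);
    card->reads++;
    return card->regs[reg];
}

void mock_ioreg_write(struct mock_card *card, uint16_t reg, uint32_t val)
{
    assert(reg < 16);
    card->writes++;
    card->regs[reg] = val;
}

/* Models the patched ATOM_IIO_WRITE case: the write is unconditional,
 * the read before it is issued only on RV515. */
void iio_write(struct mock_card *card, uint16_t reg, uint32_t val)
{
    if (card->family == CHIP_RV515)
        (void)mock_ioreg_read(card, reg);
    mock_ioreg_write(card, reg, val);
}
```

Calling `iio_write()` on an RV515 card records one read and one write; on any other family it records the write only, so the register ends up with the same value in both cases.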
[Bug 25588] Lots of ARB_vertex_program/fragment_program parser errors in ETQW (if GLSL is unavailable)
https://bugs.freedesktop.org/show_bug.cgi?id=25588

Fabio Pedretti changed:

           What    |Removed                        |Added
----------------------------------------------------------------------------
         Resolution|WORKSFORME                     |WONTFIX
          Component|Mesa core                      |Drivers/DRI/r300
         AssignedTo|mesa-dev@lists.freedesktop.org |dri-devel@lists.freedesktop.org
[Bug 33222] New: [RADEON] Oops in worker thread for radeon_unpin_work_func
https://bugzilla.kernel.org/show_bug.cgi?id=33222

           Summary: [RADEON] Oops in worker thread for radeon_unpin_work_func
           Product: Drivers
           Version: 2.5
    Kernel Version: 2.6.38.2
          Platform: All
        OS/Version: Linux
              Tree: Mainline
            Status: NEW
          Severity: low
          Priority: P1
         Component: Video(DRI - non Intel)
        AssignedTo: drivers_video-...@kernel-bugs.osdl.org
        ReportedBy: tho...@m3y3r.de
        Regression: No

Created an attachment (id=54282)
 --> (https://bugzilla.kernel.org/attachment.cgi?id=54282)
Oops - Part 1

A few days ago I stumbled upon the attached oops. Just images, sorry for that.
This is the first time I saw this oops. I just hit it once for 2.6.38.

--
Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are watching the assignee of the bug.
[Bug 33222] [RADEON] Oops in worker thread for radeon_unpin_work_func
https://bugzilla.kernel.org/show_bug.cgi?id=33222

--- Comment #1 from Thomas Meyer 2011-04-13 17:09:48 ---
Created an attachment (id=54292)
 --> (https://bugzilla.kernel.org/attachment.cgi?id=54292)
Oops - Part 2
small git lesson [Was: Re: Revert 737a3bb9416ce2a7c7a4170852473a4fcc9c67e8 ?]
On Wed, Apr 13, 2011 at 10:02:04AM +0200, Gabriel Paubert wrote:
> On Tue, Apr 12, 2011 at 01:46:10PM +0200, Michel Dänzer wrote:
> > BTW, if your kernel contains commit
> > 69a07f0b117a40fcc1a479358d8e1f41793617f2, can you try if reverting that
> > helps?
>
> My kernel is pristine 2.6.38 and does not include this commit
> (was introduced before 2.6.39-rc1 according to gitk).

gitk is not the best tool to find this out.

	$ git name-rev --refs=refs/tags/v2.6\* 69a07f0b117a40fcc1a479358d8e1f41793617f2
	69a07f0b117a40fcc1a479358d8e1f41793617f2 tags/v2.6.39-rc2~3^2~43^2~4

so it was introduced just before -rc2.

Best regards
Uwe

--
Pengutronix e.K.                           | Uwe Kleine-König            |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
Re: small git lesson [Was: Re: Revert 737a3bb9416ce2a7c7a4170852473a4fcc9c67e8 ?]
Uwe Kleine-König writes:

> $ git name-rev --refs=refs/tags/v2.6\* 69a07f0b117a40fcc1a479358d8e1f41793617f2
> 69a07f0b117a40fcc1a479358d8e1f41793617f2 tags/v2.6.39-rc2~3^2~43^2~4
>
> so it was introduced just before -rc2.

$ git tag --contains 69a07f0b117a40fcc1a479358d8e1f41793617f2
v2.6.39-rc1
v2.6.39-rc2

Andreas.

--
Andreas Schwab, sch...@redhat.com
GPG Key fingerprint = D4E8 DBE3 3813 BB5D FA84 5EC7 45C6 250E 6F00 984E
"And now for something completely different."
Re: small git lesson [Was: Re: Revert 737a3bb9416ce2a7c7a4170852473a4fcc9c67e8 ?]
Hello Gabriel

On Wed, Apr 13, 2011 at 12:31:44PM +0200, Gabriel Paubert wrote:
> On Wed, Apr 13, 2011 at 10:59:14AM +0200, Andreas Schwab wrote:
> > Uwe Kleine-König writes:
> >
> > > $ git name-rev --refs=refs/tags/v2.6\* 69a07f0b117a40fcc1a479358d8e1f41793617f2
> > > 69a07f0b117a40fcc1a479358d8e1f41793617f2 tags/v2.6.39-rc2~3^2~43^2~4
> > >
> > > so it was introduced just before -rc2.
> >
> > $ git tag --contains 69a07f0b117a40fcc1a479358d8e1f41793617f2
> > v2.6.39-rc1
> > v2.6.39-rc2
> >
>
> So who is right? I think it was before rc1.

Yep, correct. I interpreted the output of git name-rev to mean it's not
included in a tag earlier than v2.6.39-rc2, but actually that's wrong.
It's just that it's easier (for some definition of easy) to reach the
commit in question from v2.6.39-rc2 than from v2.6.39-rc1.

> However in this case the main reason to fire up gitk was to have a quick look
> at the patch and its context, and I simply reported the "Precedes" line
> in the display, which is 2.6.39-rc1. It also follows v2.6.37-rc2, which means
> that it has been quite a long time outside the main tree.

I think this conclusion isn't valid in general. (E.g. in git itself a
bug-fix is often done on top of the commit that introduced the bug and then
merged into master. Still the bugfix might be new.) But looking at the
AuthorDate of 69a07f0b117a seems to support your statement.

Best regards
Uwe

--
Pengutronix e.K.                           | Uwe Kleine-König            |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
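`git tag --contains` answers a reachability question: a tag contains a commit if that commit is the tagged commit itself or one of its ancestors. Every later tag on the same line of history therefore contains it too, which is why a name expressed relative to -rc2 does not rule out -rc1. The sketch below models this on a hypothetical five-commit history. It is not the kernel tree and not git's implementation, and all names in it are invented for illustration.

```c
#define NO_PARENT (-1)

/* One node of a toy history; a merge commit has two parents. */
struct commit {
    int parent[2];
};

/* Hypothetical history:
 *
 *   ROOT --- MAIN --- RC1 --- RC2
 *      \            /
 *       +--- FIX --+
 */
enum { ROOT, FIX, MAIN, RC1, RC2 };

const struct commit toy_history[] = {
    [ROOT] = { { NO_PARENT, NO_PARENT } },
    [FIX]  = { { ROOT, NO_PARENT } },
    [MAIN] = { { ROOT, NO_PARENT } },
    [RC1]  = { { MAIN, FIX } },
    [RC2]  = { { RC1, NO_PARENT } },
};

/* Returns 1 if target is `from` itself or one of its ancestors. */
int contains(const struct commit *history, int from, int target)
{
    if (from == NO_PARENT)
        return 0;
    if (from == target)
        return 1;
    return contains(history, history[from].parent[0], target) ||
           contains(history, history[from].parent[1], target);
}
```

In this toy history both `RC1` and `RC2` contain `FIX`, while `MAIN` does not, matching the shape of the `git tag --contains` output quoted above.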
[Bug 33222] [RADEON] Oops in worker thread for radeon_unpin_work_func
https://bugzilla.kernel.org/show_bug.cgi?id=33222

Alex Deucher changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |alexdeuc...@gmail.com

--- Comment #2 from Alex Deucher 2011-04-13 17:19:06 ---
This is a duplicate of bug 32402.
Re: Linux 2.6.39-rc3
On Wed, Apr 13, 2011 at 08:46:09AM +0200, Ingo Molnar wrote:
> Could you please send the before/after bootlog (in particular all memory init
> messages included) and your .config?
>
> before: f005fe12b90c: x86-64: Move out cleanup higmap [_brk_end, _end) out of init_memory_mapping()
> after: d2137d5af425: Merge branch 'linus' into x86/bootmem
>
> I've Cc:-ed more people who might have an idea about it.

Okay, I have done some more bisecting and debugging today. First of all,
I bisected between

	v2.6.37-rc2..f005fe12b90c

which were only a couple of patches and merged v2.6.38-rc4 in at every
step. There was no failure found.

Then I tried this again, but this time I merged v2.6.38-rc5 at every
step and was successful. The bad commit in this branch turned out to be

	1a4a678b12c84db9ae5dce424e0e97f0559bb57c

which is related to memblock.

Then I tried to find out which change between 2.6.38-rc4 and 2.6.38-rc5
is needed to trigger the failure, so I used f005fe12b90c as a base,
bisected between v2.6.38-rc4..v2.6.38-rc5 and merged every bisect step
into the base and tested. Here the bad commit turned out to be

	e6d2e2b2b1e1455df16d68a78f4a3874c7b3ad20

which is related to gart. It turned out that the gart aperture on that
box is at another position with these patches. Before it was at
0xa400 and now it is at 0xa000. It seems like this has
something to do with the root-cause.

Reverting commit 1a4a678b12c84db9ae5dce424e0e97f0559bb57c fixes the
problem btw. and booting with iommu=soft also works, but I have no idea
yet why the aperture at that address is a problem (with the patch
reverted the aperture lands at 0x8000).

I have put some debug-data online. There is my .config and two
dmesg-files for good (==2.6.39-rc3 + revert) and bad (==2.6.39-rc3).
I also created these dmesg-files again with memblock=debug, maybe that
helps to find the problem. The files are at

	http://www.8bytes.org/~joro/debug/

Or someone else has an idea about the issue...

	Joerg
[Bug 30651] [RADEON:KMS:R600G] gl output in mplayer have no colors if used with a fragment program with additional lookup and bicubic B-spline filtering
https://bugs.freedesktop.org/show_bug.cgi?id=30651

--- Comment #7 from Andy Furniss 2011-04-13 10:23:46 PDT ---
(In reply to comment #6)
> 1) yuv=4 on r600g still have no colours even though with r300g they are ok

yuv=4 with or without bicubic now works for me on 600g

> 2) still there is an overbright glitch in some white places in some videos with
> yuv=6. but again, it may be a mplayer bug since it present with r300g too (but
> not software rasterizer), i'm not sure.

This is still the same.

One general observation is that with 600g perf is poor compared to 600c or
xv, which are at least 2x faster when benchmarking with HD streams.
[Bug 30651] [RADEON:KMS:R600G] gl output in mplayer have no colors if used with a fragment program with additional lookup and bicubic B-spline filtering
https://bugs.freedesktop.org/show_bug.cgi?id=30651

--- Comment #8 from Sergey Kondakov 2011-04-13 10:46:29 PDT ---
same here.
and i never got answer about which method is better with amd/ati card and open
stack now. i hope devs are looking into that stuff.
[Bug 30651] [RADEON:KMS:R600G] gl output in mplayer have no colors if used with a fragment program with additional lookup and bicubic B-spline filtering
https://bugs.freedesktop.org/show_bug.cgi?id=30651

--- Comment #9 from Andy Furniss 2011-04-13 11:38:47 PDT ---
(In reply to comment #8)
> same here.
> and i never got answer about which method is better with amd/ati card and open
> stack now. i hope devs are looking into that stuff.

Maybe there isn't an answer as such for that question. I guess someone with an
on-board low spec GPU may be more limited than a high end card with fast vram.

Quality wise - I can't see any difference, the higher yuv= numbers give more
features like gamma correction (not sure how to use it though).

It would be nice if 600g could beat or equal 600c - it does for 3D, but for
some reason not this. I said classic was twice as fast - it's actually more
than that if I discount time taken by the codec.
[Bug 32982] Kernel locks up a few minutes after boot
https://bugzilla.kernel.org/show_bug.cgi?id=32982 --- Comment #6 from Bart Van Assche 2011-04-13 18:49:13 --- Although I'm still busy bisecting, I'd like to report that I got the following hung task report with head b73a21fc66fee35b41db755abebfacba48b2fc76 (had already seen something similar before with 2.6.39-rc2): INFO: task kjournald:918 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. kjournald D 880131b9ddb8 0 918 2 0x 880131b9dd20 0046 880131b9dca0 8108cd6d 0282 880131b9dfd8 880137729f40 880131b9dfd8 880131b9c000 880131b9c000 880131b9c000 880131b9dfd8 Call Trace: [] ? trace_hardirqs_on_caller+0x14d/0x190 [] ? sub_preempt_count+0xa9/0xe0 [] journal_commit_transaction+0x13e/0x1590 [jbd] [] ? _raw_spin_unlock_irqrestore+0x65/0x80 [] ? sub_preempt_count+0xa9/0xe0 [] ? wake_up_bit+0x40/0x40 [] ? del_timer_sync+0x8a/0xc0 [] ? try_to_del_timer_sync+0x110/0x110 [] kjournald+0xf1/0x250 [jbd] [] ? wake_up_bit+0x40/0x40 [] ? commit_timeout+0x10/0x10 [jbd] [] kthread+0x96/0xa0 [] kernel_thread_helper+0x4/0x10 [] ? finish_task_switch+0x7b/0xe0 [] ? _raw_spin_unlock_irq+0x3b/0x60 [] ? retint_restore_args+0xe/0xe [] ? __init_kthread_worker+0x70/0x70 [] ? gs_change+0xb/0xb no locks held by kjournald/918. INFO: task klauncher:5744 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. klauncher D 0001000297b4 0 5744 5743 0x 88011dd73938 0046 8801 8108cbef 813e2535 88011dd73fd8 8801382e1f40 88011dd73fd8 88011dd72000 88011dd72000 88011dd72000 88011dd73fd8 Call Trace: [] ? mark_held_locks+0x6f/0xa0 [] ? _raw_spin_unlock_irqrestore+0x65/0x80 [] ? __wait_on_buffer+0x30/0x30 [] io_schedule+0x59/0x80 [] sleep_on_buffer+0xe/0x20 [] __wait_on_bit_lock+0x5a/0xc0 [] ? __wait_on_buffer+0x30/0x30 [] out_of_line_wait_on_bit_lock+0x78/0x90 [] ? autoremove_wake_function+0x50/0x50 [] __lock_buffer+0x36/0x40 [] do_get_write_access+0x64d/0x660 [jbd] [] ? 
sub_preempt_count+0xa9/0xe0 [] ? start_this_handle+0x370/0x470 [jbd] [] ? journal_add_journal_head+0xf4/0x220 [jbd] [] journal_get_write_access+0x31/0x50 [jbd] [] __ext3_journal_get_write_access+0x2d/0x60 [ext3] [] ext3_reserve_inode_write+0x83/0xb0 [ext3] [] ext3_mark_inode_dirty+0x44/0x70 [ext3] [] ext3_dirty_inode+0x5e/0xa0 [ext3] [] __mark_inode_dirty+0x3f/0x250 [] file_update_time+0xec/0x170 [] ? mutex_lock_nested+0x27d/0x3a0 [] __generic_file_aio_write+0x1f8/0x440 [] generic_file_aio_write+0x75/0xf0 [] do_sync_write+0xda/0x120 [] ? remove_vma+0x77/0x90 [] ? trace_hardirqs_on+0xd/0x10 [] ? remove_vma+0x77/0x90 [] vfs_write+0xc6/0x170 [] sys_write+0x51/0x90 [] system_call_fastpath+0x16/0x1b 2 locks held by klauncher/5744: #0: (&sb->s_type->i_mutex_key#11){+.+.+.}, at: [] generic_file_aio_write+0x59/0xf0 #1: (jbd_handle){+.+...}, at: [] start_this_handle+0x370/0x470 [jbd] INFO: task okular:4180 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. okular D 00010002a251 0 4180 5743 0x 880041d13aa8 0046 8800 8108cd6d 0282 880041d13fd8 880037a59f40 880041d13fd8 880041d12000 880041d12000 880041d12000 880041d13fd8 Call Trace: [] ? trace_hardirqs_on_caller+0x14d/0x190 [] start_this_handle+0x244/0x470 [jbd] [] ? is_module_address+0x33/0x60 [] ? wake_up_bit+0x40/0x40 [] journal_start+0xdb/0x120 [jbd] [] ext3_journal_start_sb+0x36/0x70 [ext3] [] ext3_setattr+0x1a3/0x210 [ext3] [] notify_change+0x116/0x360 [] do_truncate+0x63/0x90 [] ? sub_preempt_count+0xa9/0xe0 [] do_last+0x42c/0x820 [] path_openat+0xd0/0x410 [] ? might_fault+0x53/0xb0 [] do_filp_open+0x7f/0xa0 [] ? sub_preempt_count+0xa9/0xe0 [] ? _raw_spin_unlock+0x35/0x60 [] ? 
alloc_fd+0xf4/0x150 [] do_sys_open+0x101/0x1e0 [] sys_open+0x20/0x30 [] system_call_fastpath+0x16/0x1b 2 locks held by okular/4180: #0: (&sb->s_type->i_mutex_key#11){+.+.+.}, at: [] do_truncate+0x57/0x90 #1: (&sb->s_type->i_alloc_sem_key#4){+.+...}, at: [] notify_change+0x2a0/0x360 -- Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email --- You are receiving this mail because: --- You are watching the assignee of the bug.
Re: Linux 2.6.39-rc3
On 04/13/2011 10:21 AM, Joerg Roedel wrote:
>
> First of all, I bisected between v2.6.37-rc2..f005fe12b90c which where
> only a couple of patches and merged v2.6.38-rc4 in at every step. There
> was no failure found.
> Then I tried this again, but this time I merged v2.6.38-rc5 at every
> step and was successful. The bad commit in this branch turned out to be
>
> 1a4a678b12c84db9ae5dce424e0e97f0559bb57c
>
> which is related to memblock.
>
> Then I tried to find out which change between 2.6.38-rc4 and 2.6.38-rc5
> is needed to trigger the failure, so I used f005fe12b90c as a base,
> bisected between v2.6.38-rc4..v2.6.38-rc5 and merged every bisect step
> into the base and tested. Here the bad commit turned out to be
>
> e6d2e2b2b1e1455df16d68a78f4a3874c7b3ad20
>
> which is related to gart. It turned out that the gart aperture on that
> box is on another position with these patches. Before it was as
> 0xa400 and now it is at 0xa000. It seems like this has something
> to do with the root-cause.
>
> Reverting commit 1a4a678b12c84db9ae5dce424e0e97f0559bb57c fixes the
> problem btw. and booting with iommu=soft also works, but I have no idea
> yet why the aperture at that address is a problem (with the patch
> reverted the aperture lands at 0x8000).
>

Does reverting e6d2e2b2b1e1455df16d68a78f4a3874c7b3ad20 solve the problem for you?

1a4a678b12c84db9ae5dce424e0e97f0559bb57c is a memory-allocation-order patch, which have a nasty tendency to unmask bugs elsewhere in the kernel. However, e6d2e2b2b1e1455df16d68a78f4a3874c7b3ad20 looks positively strange (and it doesn't exactly help that the description is written in Yinghai-ese and is therefore nearly impossible to decode, never mind tell if it is remotely correct.)

	-hpa
Re: Linux 2.6.39-rc3
On 04/13/2011 10:21 AM, Joerg Roedel wrote: > On Wed, Apr 13, 2011 at 08:46:09AM +0200, Ingo Molnar wrote: > First of all, I bisected between v2.6.37-rc2..f005fe12b90c which where > only a couple of patches and merged v2.6.38-rc4 in at every step. There > was no failure found. > Then I tried this again, but this time I merged v2.6.38-rc5 at every > step and was successful. The bad commit in this branch turned out to be > > 1a4a678b12c84db9ae5dce424e0e97f0559bb57c > > which is related to memblock. > > Then I tried to find out which change between 2.6.38-rc4 and 2.6.38-rc5 > is needed to trigger the failure, so I used f005fe12b90c as a base, > bisected between v2.6.38-rc4..v2.6.38-rc5 and merged every bisect step > into the base and tested. Here the bad commit turned out to be > > e6d2e2b2b1e1455df16d68a78f4a3874c7b3ad20 > > which is related to gart. It turned out that the gart aperture on that > box is on another position with these patches. Before it was as > 0xa400 and now it is at 0xa000. It seems like this has something > to do with the root-cause. > > Reverting commit 1a4a678b12c84db9ae5dce424e0e97f0559bb57c fixes the > problem btw. and booting with iommu=soft also works, but I have no idea > yet why the aperture at that address is a problem (with the patch > reverted the aperture lands at 0x8000). > > I have put some debug-data online. There is my .config and two > dmesg-files for good (==2.6.39-rc3 + revert) and bad (==2.6.39-rc3) > I also created these dmesg-files again with memblock=debug, maybe that > helps to find the problem. The files are at > > http://www.8bytes.org/~joro/debug/ thanks for the bisecting... so those two patches uncover some problems. [0.00] Checking aperture... [0.00] No AGP bridge found [0.00] Node 0: aperture @ a000 size 32 MB [0.00] Aperture pointing to e820 RAM. Ignoring. 
[0.00] Your BIOS doesn't leave a aperture memory hole [0.00] Please enable the IOMMU option in the BIOS setup [0.00] This costs you 64 MB of RAM [0.00] memblock_x86_reserve_range: [0xa000-0xa3ff] aperture64 [0.00] Mapping aperture over 65536 KB of RAM @ a000 so kernel try to reallocate apperture. because BIOS allocated is pointed to RAM or size is too small. but your radeon does use [0xa000, 0xbfff) [4.281993] radeon :01:05.0: VRAM: 320M 0xC000 - 0xD3FF (320M used) [4.290672] radeon :01:05.0: GTT: 512M 0xA000 - 0xBFFF [4.298550] [drm] Detected VRAM RAM=320M, BAR=256M [4.309857] [drm] RAM width 32bits DDR [4.313748] [TTM] Zone kernel: Available graphics memory: 1896524 kiB. [4.320379] [TTM] Initializing pool allocator. [4.324948] [drm] radeon: 320M of VRAM memory ready [4.329832] [drm] radeon: 512M of GTT memory ready. and the one seems working: [0.00] Checking aperture... [0.00] No AGP bridge found [0.00] Node 0: aperture @ a000 size 32 MB [0.00] Aperture pointing to e820 RAM. Ignoring. [0.00] Your BIOS doesn't leave a aperture memory hole [0.00] Please enable the IOMMU option in the BIOS setup [0.00] This costs you 64 MB of RAM [0.00] memblock_x86_reserve_range: [0x8000-0x83ff] aperture64 [0.00] Mapping aperture over 65536 KB of RAM @ 8000 [0.00] memblock_x86_reserve_range: [0xacb6bdc0-0xacb6bddf] BOOTMEM will use different position... [4.250159] radeon :01:05.0: VRAM: 320M 0xC000 - 0xD3FF (320M used) [4.258830] radeon :01:05.0: GTT: 512M 0xA000 - 0xBFFF [4.266742] [drm] Detected VRAM RAM=320M, BAR=256M [4.271549] [drm] RAM width 32bits DDR [4.275435] [TTM] Zone kernel: Available graphics memory: 1896526 kiB. [4.282066] [TTM] Initializing pool allocator. [4.282085] usb 7-2: new full speed USB device number 2 using ohci_hcd [4.293076] [drm] radeon: 320M of VRAM memory ready [4.298277] [drm] radeon: 512M of GTT memory ready. [4.303218] [drm] Supports vblank timestamp caching Rev 1 (10.10.2010). [4.309854] [drm] Driver supports precise vblank timestamp query. 
[4.315970] [drm] radeon: irq initialized. [4.320094] [drm] GART: num cpu pages 131072, num gpu pages 131072 So question is why radeon is using the address [0xa000 - 0xc00], and in E820 it is RAM [0.00] BIOS-e820: 0010 - acb8d000 (usable) [0.00] BIOS-e820: acb8d000 - acb8f000 (reserved) [0.00] BIOS-e820: acb8f000 - afce9000 (usable) [0.00] BIOS-e820: afce9000 - afd21000 (reserved) [0.00] BIOS-e820: afd21000 - afd4f000 (usable) [0.00] BIOS-e820: afd4f000 - afdcf000 (reserved) [0.00] BIOS-e820: afdcf000
Re: Linux 2.6.39-rc3
On 04/13/2011 10:21 AM, Joerg Roedel wrote:
> On Wed, Apr 13, 2011 at 08:46:09AM +0200, Ingo Molnar wrote:
>> Could you please send the before/after bootlog (in particular all memory init
>> messages included) and your .config?
>>
>> before: f005fe12b90c: x86-64: Move out cleanup higmap [_brk_end, _end) out of init_memory_mapping()
>> after: d2137d5af425: Merge branch 'linus' into x86/bootmem
>>
>> I've Cc:-ed more people who might have an idea about it.
>
> Okay, I have done some more bisecting and debugging today.
>

First of all, *huge* thanks for this effort. At least we need to track down the bits that need to be reverted -- it is past rc3, and it's time to see what we should revert and tell the submitter to try again next cycle.

This looks to be the same issue as in bugzilla 33012:

https://bugzilla.kernel.org/show_bug.cgi?id=33012

... so it would be good if we could keep the information in there.

	-hpa
Re: Linux 2.6.39-rc3
On Wed, Apr 13, 2011 at 11:51:39AM -0700, H. Peter Anvin wrote:
> On 04/13/2011 10:21 AM, Joerg Roedel wrote:
> >
> > First of all, I bisected between v2.6.37-rc2..f005fe12b90c which where
> > only a couple of patches and merged v2.6.38-rc4 in at every step. There
> > was no failure found.
> > Then I tried this again, but this time I merged v2.6.38-rc5 at every
> > step and was successful. The bad commit in this branch turned out to be
> >
> > 1a4a678b12c84db9ae5dce424e0e97f0559bb57c
> >
> > which is related to memblock.
> >
> > Then I tried to find out which change between 2.6.38-rc4 and 2.6.38-rc5
> > is needed to trigger the failure, so I used f005fe12b90c as a base,
> > bisected between v2.6.38-rc4..v2.6.38-rc5 and merged every bisect step
> > into the base and tested. Here the bad commit turned out to be
> >
> > e6d2e2b2b1e1455df16d68a78f4a3874c7b3ad20
> >
> > which is related to gart. It turned out that the gart aperture on that
> > box is on another position with these patches. Before it was as
> > 0xa400 and now it is at 0xa000. It seems like this has something
> > to do with the root-cause.
> >
> > Reverting commit 1a4a678b12c84db9ae5dce424e0e97f0559bb57c fixes the
> > problem btw. and booting with iommu=soft also works, but I have no idea
> > yet why the aperture at that address is a problem (with the patch
> > reverted the aperture lands at 0x8000).
> >
>
> Does reverting e6d2e2b2b1e1455df16d68a78f4a3874c7b3ad20 solve the
> problem for you?

No, reverting that patch doesn't make the problem go away (and the gart aperture is still at 0xa000). I tested this on 39-rc3; I haven't tested whether it makes a difference on the original bisect commit from Ingo, though it probably does (I don't know if that matters). What is strange about this commit is that it fixes an x86 gart aperture allocation bug in generic memblock code.

> 1a4a678b12c84db9ae5dce424e0e97f0559bb57c is a memory-allocation-order
> patch, which have a nasty tendency to unmask bugs elsewhere in the
> kernel. However, e6d2e2b2b1e1455df16d68a78f4a3874c7b3ad20 looks
> positively strange (and it doesn't exactly help that the description is
> written in Yinghai-ese and is therefore nearly impossible to decode,
> never mind tell if it is remotely correct.)

I think that the two commits are okay and the bug is somewhere else, but I have no idea yet where to look next. I spent some time looking at radeon code and talking to Alex about it (because it seemed suspicious that the GTT is at 0xa000 too, but as Alex explained to me this is an address in the GPU address space and shouldn't matter).

Regards,

	Joerg
Re: Linux 2.6.39-rc3
On Wed, Apr 13, 2011 at 11:39:29AM -0700, H. Peter Anvin wrote:
> On 04/13/2011 10:21 AM, Joerg Roedel wrote:
> > On Wed, Apr 13, 2011 at 08:46:09AM +0200, Ingo Molnar wrote:
> >> Could you please send the before/after bootlog (in particular all memory init
> >> messages included) and your .config?
> >>
> >> before: f005fe12b90c: x86-64: Move out cleanup higmap [_brk_end, _end) out of init_memory_mapping()
> >> after: d2137d5af425: Merge branch 'linus' into x86/bootmem
> >>
> >> I've Cc:-ed more people who might have an idea about it.
> >
> > Okay, I have done some more bisecting and debugging today.
> >
>
> First of all, *huge* thanks for this effort. At least we need to track
> down the bits that need to be reverted -- it is past rc3, and it's time
> to see what we should revert and tell the submitter to try again next cycle.
>
> This looks to be the same issue as in bugzilla 33012:
>
> https://bugzilla.kernel.org/show_bug.cgi?id=33012
>
> ... so it would be good if we could keep the information in there.

Yes, I will try to find my korg bugzilla account again and put the information from this email there.

	Joerg
Re: Linux 2.6.39-rc3
On Wed, Apr 13, 2011 at 12:14:55PM -0700, Yinghai Lu wrote:
> thanks for the bisecting...
>
> so those two patches uncover some problems.
>
> [0.00] Checking aperture...
> [0.00] No AGP bridge found
> [0.00] Node 0: aperture @ a000 size 32 MB
> [0.00] Aperture pointing to e820 RAM. Ignoring.
> [0.00] Your BIOS doesn't leave a aperture memory hole
> [0.00] Please enable the IOMMU option in the BIOS setup
> [0.00] This costs you 64 MB of RAM
> [0.00] memblock_x86_reserve_range: [0xa000-0xa3ff]
> aperture64
> [0.00] Mapping aperture over 65536 KB of RAM @ a000
>
> so kernel try to reallocate apperture. because BIOS allocated is pointed to
> RAM or size is too small.

It is actually beyond 4GB on that machine; the value read here is left over from the previous kernel boot. The BIOS does not reset these values on a reboot.

> but your radeon does use [0xa000, 0xbfff)

Yes, I suspected that too (and spent a few hours reading radeon code), but then I talked to Alex Deucher and he explained that the addresses which the driver prints for GTT and VRAM are in the GPU address space and do not refer to system RAM. So this shouldn't be the problem.

	Joerg
Re: Linux 2.6.39-rc3
On Wed, Apr 13, 2011 at 3:14 PM, Yinghai Lu wrote: > On 04/13/2011 10:21 AM, Joerg Roedel wrote: >> On Wed, Apr 13, 2011 at 08:46:09AM +0200, Ingo Molnar wrote: >> First of all, I bisected between v2.6.37-rc2..f005fe12b90c which where >> only a couple of patches and merged v2.6.38-rc4 in at every step. There >> was no failure found. >> Then I tried this again, but this time I merged v2.6.38-rc5 at every >> step and was successful. The bad commit in this branch turned out to be >> >> 1a4a678b12c84db9ae5dce424e0e97f0559bb57c >> >> which is related to memblock. >> >> Then I tried to find out which change between 2.6.38-rc4 and 2.6.38-rc5 >> is needed to trigger the failure, so I used f005fe12b90c as a base, >> bisected between v2.6.38-rc4..v2.6.38-rc5 and merged every bisect step >> into the base and tested. Here the bad commit turned out to be >> >> e6d2e2b2b1e1455df16d68a78f4a3874c7b3ad20 >> >> which is related to gart. It turned out that the gart aperture on that >> box is on another position with these patches. Before it was as >> 0xa400 and now it is at 0xa000. It seems like this has something >> to do with the root-cause. >> >> Reverting commit 1a4a678b12c84db9ae5dce424e0e97f0559bb57c fixes the >> problem btw. and booting with iommu=soft also works, but I have no idea >> yet why the aperture at that address is a problem (with the patch >> reverted the aperture lands at 0x8000). >> >> I have put some debug-data online. There is my .config and two >> dmesg-files for good (==2.6.39-rc3 + revert) and bad (==2.6.39-rc3) >> I also created these dmesg-files again with memblock=debug, maybe that >> helps to find the problem. The files are at >> >> http://www.8bytes.org/~joro/debug/ > > thanks for the bisecting... > > so those two patches uncover some problems. > > [ 0.00] Checking aperture... > [ 0.00] No AGP bridge found > [ 0.00] Node 0: aperture @ a000 size 32 MB > [ 0.00] Aperture pointing to e820 RAM. Ignoring. 
> [ 0.00] Your BIOS doesn't leave a aperture memory hole > [ 0.00] Please enable the IOMMU option in the BIOS setup > [ 0.00] This costs you 64 MB of RAM > [ 0.00] memblock_x86_reserve_range: [0xa000-0xa3ff] > aperture64 > [ 0.00] Mapping aperture over 65536 KB of RAM @ a000 > > so kernel try to reallocate apperture. because BIOS allocated is pointed to > RAM or size is too small. > > but your radeon does use [0xa000, 0xbfff) > > [ 4.281993] radeon :01:05.0: VRAM: 320M 0xC000 - > 0xD3FF (320M used) > [ 4.290672] radeon :01:05.0: GTT: 512M 0xA000 - > 0xBFFF > [ 4.298550] [drm] Detected VRAM RAM=320M, BAR=256M > [ 4.309857] [drm] RAM width 32bits DDR > [ 4.313748] [TTM] Zone kernel: Available graphics memory: 1896524 kiB. > [ 4.320379] [TTM] Initializing pool allocator. > [ 4.324948] [drm] radeon: 320M of VRAM memory ready > [ 4.329832] [drm] radeon: 512M of GTT memory ready. > > and the one seems working: > > [ 0.00] Checking aperture... > [ 0.00] No AGP bridge found > [ 0.00] Node 0: aperture @ a000 size 32 MB > [ 0.00] Aperture pointing to e820 RAM. Ignoring. > [ 0.00] Your BIOS doesn't leave a aperture memory hole > [ 0.00] Please enable the IOMMU option in the BIOS setup > [ 0.00] This costs you 64 MB of RAM > [ 0.00] memblock_x86_reserve_range: [0x8000-0x83ff] > aperture64 > [ 0.00] Mapping aperture over 65536 KB of RAM @ 8000 > [ 0.00] memblock_x86_reserve_range: [0xacb6bdc0-0xacb6bddf] > BOOTMEM > > will use different position... > > [ 4.250159] radeon :01:05.0: VRAM: 320M 0xC000 - > 0xD3FF (320M used) > [ 4.258830] radeon :01:05.0: GTT: 512M 0xA000 - > 0xBFFF > [ 4.266742] [drm] Detected VRAM RAM=320M, BAR=256M > [ 4.271549] [drm] RAM width 32bits DDR > [ 4.275435] [TTM] Zone kernel: Available graphics memory: 1896526 kiB. > [ 4.282066] [TTM] Initializing pool allocator. > [ 4.282085] usb 7-2: new full speed USB device number 2 using ohci_hcd > [ 4.293076] [drm] radeon: 320M of VRAM memory ready > [ 4.298277] [drm] radeon: 512M of GTT memory ready. 
> [ 4.303218] [drm] Supports vblank timestamp caching Rev 1 (10.10.2010). > [ 4.309854] [drm] Driver supports precise vblank timestamp query. > [ 4.315970] [drm] radeon: irq initialized. > [ 4.320094] [drm] GART: num cpu pages 131072, num gpu pages 131072 > > So question is why radeon is using the address [0xa000 - 0xc00], and > in E820 it is RAM The VRAM and GTT addresses in the dmesg are internal GPU addresses not system addresses. The GPU has it's own internal address space for on-chip memory clients (texture samplers, render buffers, display controllers, etc.). The GPU sets up two apertures in it's internal addres
Re: [PATCH] drm/radeon/kms: fix suspend on rv530 asics
On Thu, Apr 14, 2011 at 12:52 AM, Alex Deucher wrote:
> On Wed, Apr 13, 2011 at 10:46 AM, Jerome Glisse wrote:
>> On Tue, Apr 12, 2011 at 1:33 PM, Alex Deucher wrote:
>>> Apparently only rv515 asics need the workaround
>>> added in f24d86f1a49505cdea56728b853a5d0a3f8e3d11
>>> (drm/radeon/kms: fix resume regression for some r5xx laptops).
>>>
>>> Fixes:
>>> https://bugs.freedesktop.org/show_bug.cgi?id=34709
>>>
>>> Signed-off-by: Alex Deucher
>>> Cc: sta...@kernel.org
>>> ---
>>>  drivers/gpu/drm/radeon/atom.c | 6 +-
>>>  1 files changed, 5 insertions(+), 1 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/radeon/atom.c b/drivers/gpu/drm/radeon/atom.c
>>> index 258fa5e..d71d375 100644
>>> --- a/drivers/gpu/drm/radeon/atom.c
>>> +++ b/drivers/gpu/drm/radeon/atom.c
>>> @@ -32,6 +32,7 @@
>>>  #include "atom.h"
>>>  #include "atom-names.h"
>>>  #include "atom-bits.h"
>>> +#include "radeon.h"
>>>
>>>  #define ATOM_COND_ABOVE 0
>>>  #define ATOM_COND_ABOVEOREQUAL 1
>>> @@ -101,7 +102,9 @@ static void debug_print_spaces(int n)
>>>  static uint32_t atom_iio_execute(struct atom_context *ctx, int base,
>>>                                   uint32_t index, uint32_t data)
>>>  {
>>> +       struct radeon_device *rdev = ctx->card->dev->dev_private;
>>>         uint32_t temp = 0xCDCDCDCD;
>>> +
>>>         while (1)
>>>                 switch (CU8(base)) {
>>>                 case ATOM_IIO_NOP:
>>> @@ -112,7 +115,8 @@ static uint32_t atom_iio_execute(struct atom_context *ctx, int base,
>>>                         base += 3;
>>>                         break;
>>>                 case ATOM_IIO_WRITE:
>>> -                       (void)ctx->card->ioreg_read(ctx->card, CU16(base + 1));
>>> +                       if (rdev->family == CHIP_RV515)
>>> +                               (void)ctx->card->ioreg_read(ctx->card, CU16(base + 1));
>>>                         ctx->card->ioreg_write(ctx->card, CU16(base + 1), temp);
>>>                         base += 3;
>>>                         break;
>>> --
>>> 1.7.1.1
>>>
>>
>>
>> So this patch enable io write only for one family ? This looks utterly
>> strange.
>
> No, it just does a read before write for rv515. I don't know why it
> needs it, but it seems to.
>

Yeah I really wish I knew why either, Thinkpad T60 with X1300, no resume without this, it failed in the memory initialisation table. this was the only thing I could find to fix it.

My x1300 desktop card works fine without this.

Dave.
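The read-before-write gating discussed in this thread can be sketched outside the driver. What follows is a minimal illustration under stated assumptions, not the radeon code: `toy_card`, `toy_ioreg_read`, `toy_ioreg_write`, `toy_iio_write` and the `FAMILY_*` values are hypothetical names, and the access counters exist only so the effect of the condition can be observed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for chip families; not the driver's real enum. */
enum chip_family { FAMILY_RV515, FAMILY_RV530 };

/* Toy register file that counts accesses so the behaviour is observable. */
struct toy_card {
    enum chip_family family;
    uint32_t regs[16];
    unsigned reads;
    unsigned writes;
};

static uint32_t toy_ioreg_read(struct toy_card *card, unsigned reg)
{
    assert(reg < 16);
    card->reads++;
    return card->regs[reg];
}

static void toy_ioreg_write(struct toy_card *card, unsigned reg, uint32_t val)
{
    assert(reg < 16);
    card->writes++;
    card->regs[reg] = val;
}

/*
 * Write a register. Only the family that is assumed to need the
 * workaround gets a dummy read first; its result is discarded.
 */
static void toy_iio_write(struct toy_card *card, unsigned reg, uint32_t val)
{
    if (card->family == FAMILY_RV515)
        (void)toy_ioreg_read(card, reg);
    toy_ioreg_write(card, reg, val);
}
```

Calling `toy_iio_write` on a card whose family is `FAMILY_RV515` performs one read and one write; on any other family it performs only the write. Why the real hardware needs the extra read is not explained anywhere in the thread, so the sketch makes no claim about that.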
Re: Revert 737a3bb9416ce2a7c7a4170852473a4fcc9c67e8 ?
Michel Dänzer wrote:

That does sound like the GPU locks up. Do you get any messages in dmesg about lockups and attempts to reset the GPU at any time?

No.

Hmm, I guess the constant SIGALRMs might prevent the lockup detection from kicking in... Maybe you can try starting the X server with -dumbSched to see if that gets things along any further, but in the end there's probably no way around figuring out what causes the lockup and fixing that anyway.

I have an old AGP box that locks with 600g + agpgart - It used to give GPU lockup to dmesg/log, but (I only test it occasionally) it doesn't anymore. I can still sysrq OK.

I wonder if something changed in recent months in the drm/whatever code that has changed/blocked the logging.
[Bug 36221] New: KMS with X1950 XT i2c error --> no ddc
https://bugs.freedesktop.org/show_bug.cgi?id=36221

           Summary: KMS with X1950 XT i2c error --> no ddc
           Product: DRI
           Version: unspecified
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: critical
          Priority: medium
         Component: DRM/Radeon
        AssignedTo: dri-devel@lists.freedesktop.org
        ReportedBy: revea...@freakmail.de

Hello!

This is the operating system and kernel:

cat /etc/SuSE-release
openSUSE 11.4 (i586)
VERSION = 11.4
CODENAME = Celadon

uname -rio
2.6.37.1-1.2-desktop i386 GNU/Linux

When trying to boot with kernel mode setting there is no DDC due to an i2c error, resulting in a blank screen.

I was told in the IRC channel #radeon to open a bug report and attach vbios.rom and the dmesg of a boot with KMS enabled; I hope you can help me out!

Many thanks for all your help!

Greetings, R
[Bug 36221] KMS with X1950 XT i2c error --> no ddc
https://bugs.freedesktop.org/show_bug.cgi?id=36221 --- Comment #1 from revealed 2011-04-13 13:41:58 PDT --- Created an attachment (id=45589) --> (https://bugs.freedesktop.org/attachment.cgi?id=45589) vbios.rom
[Bug 36221] KMS with X1950 XT i2c error --> no ddc
https://bugs.freedesktop.org/show_bug.cgi?id=36221 --- Comment #2 from revealed 2011-04-13 13:43:06 PDT --- Created an attachment (id=45590) --> (https://bugs.freedesktop.org/attachment.cgi?id=45590) Full dmesg containing the i2c error
Re: Linux 2.6.39-rc3
On 04/13/2011 12:34 PM, Joerg Roedel wrote:
> On Wed, Apr 13, 2011 at 12:14:55PM -0700, Yinghai Lu wrote:
>> thanks for the bisecting...
>>
>> so those two patches uncover some problems.
>>
>> [0.00] Checking aperture...
>> [0.00] No AGP bridge found
>> [0.00] Node 0: aperture @ a000 size 32 MB
>> [0.00] Aperture pointing to e820 RAM. Ignoring.
>> [0.00] Your BIOS doesn't leave a aperture memory hole
>> [0.00] Please enable the IOMMU option in the BIOS setup
>> [0.00] This costs you 64 MB of RAM
>> [0.00] memblock_x86_reserve_range: [0xa000-0xa3ff]
>> aperture64
>> [0.00] Mapping aperture over 65536 KB of RAM @ a000
>>
>> so kernel try to reallocate apperture. because BIOS allocated is pointed to
>> RAM or size is too small.
>
> It is actually beyond 4GB on that machine, this value read here is from
> the previous kernel-boot. The BIOS does not reset these values on a
> reboot.
>
>> but your radeon does use [0xa000, 0xbfff)
>
> Yes, I suspected that too (and spent a few hours reading radeon code),
> but then I talked the Alex Deucher and he explained that these addresses
> which the driver prints for GTT and VRAM are in the GPU address space
> and do not refer to system ram. So this shouldn't be the problem.

can you try following change ? it will push gart to 0x8000

diff --git a/arch/x86/kernel/aperture_64.c b/arch/x86/kernel/aperture_64.c
index 86d1ad4..3b6a9d5 100644
--- a/arch/x86/kernel/aperture_64.c
+++ b/arch/x86/kernel/aperture_64.c
@@ -83,7 +83,7 @@ static u32 __init allocate_aperture(void)
 	 * so don't use 512M below as gart iommu, leave the space for kernel
 	 * code for safe
 	 */
-	addr = memblock_find_in_range(0, 1ULL<<32, aper_size, 512ULL<<20);
+	addr = memblock_find_in_range(0, 1ULL<<32, aper_size, 512ULL<<21);
 	if (addr == MEMBLOCK_ERROR || addr + aper_size > 0x) {
 		printk(KERN_ERR
 			"Cannot allocate aperture memory hole (%lx,%uK)\n",
Re: Linux 2.6.39-rc3
On Wed, Apr 13, 2011 at 1:48 PM, Yinghai Lu wrote:
>
> can you try following change ? it will push gart to 0x8000
>
> diff --git a/arch/x86/kernel/aperture_64.c b/arch/x86/kernel/aperture_64.c
> index 86d1ad4..3b6a9d5 100644
> --- a/arch/x86/kernel/aperture_64.c
> +++ b/arch/x86/kernel/aperture_64.c
> @@ -83,7 +83,7 @@ static u32 __init allocate_aperture(void)
>          * so don't use 512M below as gart iommu, leave the space for kernel
>          * code for safe
>          */
> -       addr = memblock_find_in_range(0, 1ULL<<32, aper_size, 512ULL<<20);
> +       addr = memblock_find_in_range(0, 1ULL<<32, aper_size, 512ULL<<21);

What are all the magic numbers, and why would 0x8000 be special?

Why don't we write code that just works?

Or absent a "just works" set of patches, why don't we revert to code that has years of testing?

This kind of "I broke things, so now I will jiggle things randomly until they unbreak" is not acceptable.

Either explain why that fixes a real BUG (and why the magic constants need to be what they are), or just revert the patch that caused the problem, and go back to the allocation patterns that have years of experience.

Guys, we've had this discussion before, in PCI allocation. We don't do this. We tried switching the PCI region allocations to top-down, and IT WAS A FAILURE. We reverted it to what we had years of testing with.

Don't just make random changes. There really are only two acceptable models of development: "think and analyze" or "years and years of testing on thousands of machines". Those two really do work.

	Linus
Re: Linux 2.6.39-rc3
On 04/13/2011 01:54 PM, Linus Torvalds wrote: > On Wed, Apr 13, 2011 at 1:48 PM, Yinghai Lu wrote: >> >> can you try following change ? it will push gart to 0x8000 >> >> diff --git a/arch/x86/kernel/aperture_64.c b/arch/x86/kernel/aperture_64.c >> index 86d1ad4..3b6a9d5 100644 >> --- a/arch/x86/kernel/aperture_64.c >> +++ b/arch/x86/kernel/aperture_64.c >> @@ -83,7 +83,7 @@ static u32 __init allocate_aperture(void) >> * so don't use 512M below as gart iommu, leave the space for kernel >> * code for safe >> */ >> - addr = memblock_find_in_range(0, 1ULL<<32, aper_size, 512ULL<<20); >> + addr = memblock_find_in_range(0, 1ULL<<32, aper_size, 512ULL<<21); > > What are all the magic numbers, and why would 0x8000 be special? that is the old value when kernel was doing bottom-up bootmem allocation. > > Why don't we write code that just works? > > Or absent a "just works" set of patches, why don't we revert to code > that has years of testing? > > This kind of "I broke things, so now I will jiggle things randomly > until they unbreak" is not acceptable. > > Either explain why that fixes a real BUG (and why the magic constants > need to be what they are), or just revert the patch that caused the > problem, and go back to the allocation patters that have years of > experience. > > Guys, we've had this discussion before, in PCI allocation. We don't do > this. We tried switching the PCI region allocations to top-down, and > IT WAS A FAILURE. We reverted it to what we had years of testing with. > > Don't just make random changes. There really are only two acceptable > models of development: "think and analyze" or "years and years of > testing on thousands of machines". Those two really do work. We did do the analyzing, and only difference seems to be: good one is using 0x8000 and bad one is using 0xa000. We try to figure out if it needs low address and it happen to work because kernel was doing bottom up allocation. 
Thanks

Yinghai
Re: Linux 2.6.39-rc3
On Wed, Apr 13, 2011 at 01:48:48PM -0700, Yinghai Lu wrote:
> -	addr = memblock_find_in_range(0, 1ULL<<32, aper_size, 512ULL<<20);
> +	addr = memblock_find_in_range(0, 1ULL<<32, aper_size, 512ULL<<21);

Btw, while looking at this code I wondered why the 512M goal is enforced
by the alignment. Start could be set to 512M instead and the alignment
can be aper_size as it should. Any reason for such a big alignment?

	Joerg

P.S.: The box is still in the office, I will try this debug-patch
tomorrow.
Re: Linux 2.6.39-rc3
On 04/13/2011 02:50 PM, Joerg Roedel wrote: > On Wed, Apr 13, 2011 at 01:48:48PM -0700, Yinghai Lu wrote: >> -addr = memblock_find_in_range(0, 1ULL<<32, aper_size, 512ULL<<20); >> +addr = memblock_find_in_range(0, 1ULL<<32, aper_size, 512ULL<<21); > > Btw, while looking at this code I wondered why the 512M goal is enforced > by the alignment. Start could be set to 512M instead and the alignment > can be aper_size as it should. Any reason for such a big alignment? > when using bootmem, try to use big alignment (512M ), so we could avoid take ram range below 512M. commit 7677b2ef6c0c4fddc84f6473f3863f40eb71821b Author: Yinghai Lu Date: Mon Apr 14 20:40:37 2008 -0700 x86_64: allocate gart aperture from 512M because we try to reserve dma32 early, so we have chance to get aperture from 64M. with some sequence aperture allocated from RAM, could become E820_RESERVED. and then if doing a kexec with a big kernel that uncompressed size is above 64M we could have a range conflict with still using gart. So allocate gart aperture from 512M instead. Also change the fallback_aper_order to 5, because we don't have chance to get 2G or 4G aperture. We can change it back to 32M or make it equal to size. > > P.S.: The box is still in the office, I will try this debug-patch > tomorrow. Alexandre's system is working at 0xa400 with 2.6.38.2 So it is not low address problem. could be other reason like some other code could need lower address. Thanks Yinghai ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: Linux 2.6.39-rc3
On 04/13/2011 02:50 PM, Joerg Roedel wrote:
> On Wed, Apr 13, 2011 at 01:48:48PM -0700, Yinghai Lu wrote:
>> -addr = memblock_find_in_range(0, 1ULL<<32, aper_size, 512ULL<<20);
>> +addr = memblock_find_in_range(0, 1ULL<<32, aper_size, 512ULL<<21);
>
> Btw, while looking at this code I wondered why the 512M goal is enforced
> by the alignment. Start could be set to 512M instead and the alignment
> can be aper_size as it should. Any reason for such a big alignment?
>
> Joerg
>
> P.S.: The box is still in the office, I will try this debug-patch
> tomorrow.

The only reason that I can think of is that the aperture itself can be
huge, and perhaps 512 MiB is the biggest such known.

512ULL<<21 is of course a particularly moronic way to write 1 GiB, but
it was a debug patch.

The value 512 MiB apparently comes from
7677b2ef6c0c4fddc84f6473f3863f40eb71821b, which is apparently totally ad
hoc; effectively it tries to prevent a collision with kexec by
hardcoding the kdump allocation as it sat at that point in time in the
GART assignment rules.

Yeah. Brilliant.

	-hpa
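As a rough illustration of the alignment arithmetic in this exchange, the C sketch below checks that 512ULL<<21 is 1 GiB and shows how a top-down search can land on different bases for the two alignments. Everything in it is an assumption made for illustration: `toy_find_top_down` is a hypothetical single-region stand-in and not the kernel's memblock_find_in_range(); the free region 0x00100000-0xacb8d000 is one reading of the first "usable" e820 range quoted elsewhere in the thread; and the truncated addresses 0x8000 and 0xa000 in the mails are read here as 0x80000000 and 0xa0000000.

```c
#include <assert.h>
#include <stdint.h>

#define MiB (1ULL << 20)

/* Round x down to a multiple of align (align must be a power of two). */
static uint64_t round_down_pow2(uint64_t x, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return x & ~(align - 1);
}

/*
 * Toy top-down search in ONE free region [start, end): return the highest
 * align-aligned base with base + size <= end, or 0 if nothing fits.
 * Hypothetical helper for illustration only; the real allocator walks
 * many regions and reservations.
 */
static uint64_t toy_find_top_down(uint64_t start, uint64_t end,
				  uint64_t size, uint64_t align)
{
	uint64_t base;

	if (end < start || end - start < size)
		return 0;
	base = round_down_pow2(end - size, align);
	return base >= start ? base : 0;
}
```

Under those assumptions, a 64 MiB request with 512 MiB alignment (512ULL<<20) resolves to 0xa0000000, while the debug patch's 1 GiB alignment (512ULL<<21) resolves to 0x80000000. The sketch only reproduces the arithmetic; it says nothing about why one address works and the other does not.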
Re: Linux 2.6.39-rc3
On 04/13/2011 02:59 PM, Yinghai Lu wrote:
> On 04/13/2011 02:50 PM, Joerg Roedel wrote:
>> On Wed, Apr 13, 2011 at 01:48:48PM -0700, Yinghai Lu wrote:
>>> -	addr = memblock_find_in_range(0, 1ULL<<32, aper_size, 512ULL<<20);
>>> +	addr = memblock_find_in_range(0, 1ULL<<32, aper_size, 512ULL<<21);
>>
>> Btw, while looking at this code I wondered why the 512M goal is enforced
>> by the alignment. Start could be set to 512M instead and the alignment
>> can be aper_size as it should. Any reason for such a big alignment?
>>
>
> when using bootmem, try to use big alignment (512M ), so we could avoid take
> ram range below 512M.
>

Yes, his question was why on Earth are you using 0 as start if that is
the purpose.

On top of that, where the hell does the magic 512 MiB come from? It
looks like it is either completely ad hoc, or it has something to do
with where the kexec kernel was allocated once upon a time.

	-hpa
Re: Linux 2.6.39-rc3
On Wed, Apr 13, 2011 at 03:01:10PM -0700, H. Peter Anvin wrote:
> On 04/13/2011 02:50 PM, Joerg Roedel wrote:
> > On Wed, Apr 13, 2011 at 01:48:48PM -0700, Yinghai Lu wrote:
> >> -	addr = memblock_find_in_range(0, 1ULL<<32, aper_size, 512ULL<<20);
> >> +	addr = memblock_find_in_range(0, 1ULL<<32, aper_size, 512ULL<<21);
> >
> > Btw, while looking at this code I wondered why the 512M goal is enforced
> > by the alignment. Start could be set to 512M instead and the alignment
> > can be aper_size as it should. Any reason for such a big alignment?
> >
> > Joerg
> >
> > P.S.: The box is still in the office, I will try this debug-patch
> > tomorrow.
>
> The only reason that I can think of is that the aperture itself can be
> huge, and perhaps 512 MiB is the biggest such known.

Well, that would work as well by just using aper_size as alignment, the
aperture needs to be aligned on its size anyway. This code only runs
when Linux allocates the aperture itself, and if I am not mistaken it
always uses 64MB when doing this.

	Joerg
Re: Linux 2.6.39-rc3
On 04/13/2011 03:22 PM, Joerg Roedel wrote:
> On Wed, Apr 13, 2011 at 03:01:10PM -0700, H. Peter Anvin wrote:
>> On 04/13/2011 02:50 PM, Joerg Roedel wrote:
>>> On Wed, Apr 13, 2011 at 01:48:48PM -0700, Yinghai Lu wrote:
>>>> -	addr = memblock_find_in_range(0, 1ULL<<32, aper_size, 512ULL<<20);
>>>> +	addr = memblock_find_in_range(0, 1ULL<<32, aper_size, 512ULL<<21);
>>>
>>> Btw, while looking at this code I wondered why the 512M goal is enforced
>>> by the alignment. Start could be set to 512M instead and the alignment
>>> can be aper_size as it should. Any reason for such a big alignment?
>>>
>>> Joerg
>>>
>>> P.S.: The box is still in the office, I will try this debug-patch
>>> tomorrow.
>>
>> The only reason that I can think of is that the aperture itself can be
>> huge, and perhaps 512 MiB is the biggest such known.
>
> Well, that would work as well by just using aper_size as alignment, the
> aperture needs to be aligned on its size anyway. This code only runs
> when Linux allocates the aperture itself and if I am mistaken is uses
> always 64MB when doing this.

Yes, I would agree with that. The sane thing would be to set the base
to whatever address needs to be guarded against (WHICH SHOULD BE
MOTIVATED), and use aper_size as alignment, *unless* we are only using
the initial portion of a much larger hardware structure that needs
natural alignment (which isn't clear to me, I do know we sometimes use
only a fraction of the GART, but that doesn't mean we need to
naturally-align the entire thing, nor that 512 MiB is sufficient to do
so.)

	-hpa
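The difference between enforcing a floor through the alignment and enforcing it through the start of the search window can be sketched in C. This is a minimal toy under stated assumptions, not kernel code: `toy_find_bottom_up` is a hypothetical first-fit over a single free region, and the region bounds used in the note below are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define MiB (1ULL << 20)

/* Round x up to a multiple of align (align must be a power of two). */
static uint64_t round_up_pow2(uint64_t x, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (x + align - 1) & ~(align - 1);
}

/*
 * Toy bottom-up first fit in one free region [free_start, free_end),
 * restricted to the search window [start, end). Returns the lowest
 * aligned base that fits, or UINT64_MAX when nothing fits (0 is a
 * legitimate result here, so it cannot double as the error value).
 */
static uint64_t toy_find_bottom_up(uint64_t free_start, uint64_t free_end,
				   uint64_t start, uint64_t end,
				   uint64_t size, uint64_t align)
{
	uint64_t lo = free_start > start ? free_start : start;
	uint64_t hi = free_end < end ? free_end : end;
	uint64_t base = round_up_pow2(lo, align);

	if (base > hi || hi - base < size)
		return UINT64_MAX;
	return base;
}
```

With a free region starting at 1 MiB, both variants (start 0 with 512 MiB alignment, or start 512 MiB with 64 MiB alignment) return 0x20000000 in this toy. If the free region started at 0, the alignment-only variant would return 0, which is below the intended floor, while the explicit start still returns 0x20000000. That is the sense in which the alignment only happens to act as a floor.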
Re: Linux 2.6.39-rc3
On Wed, Apr 13, 2011 at 2:23 PM, Yinghai Lu wrote: >> >> What are all the magic numbers, and why would 0x8000 be special? > > that is the old value when kernel was doing bottom-up bootmem allocation. I understand, BUT THAT IS STILL A TOTALLY MAGIC NUMBER! It makes it come out the same ON THAT ONE MACHINE. So no, it's not "the old value". It's a random value that gets the old value in one specific case. >> Why don't we write code that just works? >> >> Or absent a "just works" set of patches, why don't we revert to code >> that has years of testing? >> >> This kind of "I broke things, so now I will jiggle things randomly >> until they unbreak" is not acceptable. >> >> Either explain why that fixes a real BUG (and why the magic constants >> need to be what they are), or just revert the patch that caused the >> problem, and go back to the allocation patters that have years of >> experience. >> >> Guys, we've had this discussion before, in PCI allocation. We don't do >> this. We tried switching the PCI region allocations to top-down, and >> IT WAS A FAILURE. We reverted it to what we had years of testing with. >> >> Don't just make random changes. There really are only two acceptable >> models of development: "think and analyze" or "years and years of >> testing on thousands of machines". Those two really do work. > > We did do the analyzing, and only difference seems to be: No. Yinghai, we have had this discussion before, and dammit, you need to understand the difference between "understanding the problem" and "put in random values until it works on one machine". There was absolutely _zero_ analysis done. You do not actually understand WHY the numbers matter. You just look at two random numbers, and one works, the other does not. That's not "analyzing". That's just "random number games". 
If you cannot see and understand the difference between an actual analytical solution where you _understand_ what the code is doing and why, and "random numbers that happen to work on one machine", I don't know what to tell you. > good one is using 0x8000 > and bad one is using 0xa000. > > We try to figure out if it needs low address and it happen to work > because kernel was doing bottom up allocation. No. Let me repeat my point one more time. You have TWO choices. Not more, not less: - choice #1: go back to the old allocation model. It's tested. It doesn't regress. Admittedly we may not know exactly _why_ it works, and it might not work on all machines, but it doesn't cause regressions (ie the machines it doesn't work on it _never_ worked on). And this doesn't mean "old value for that _one_ machine". It means "old value for _every_ machine". So it means we revert the whole bottom-down thing entirely. Not just "change one random number so that the totally different allocation pattern happens to give the same result on one particular machine". Quite frankly, I don't see the point of doing top-to-bottom anyway, so I think we should do this regardless. Just revert the whole "allocate from top". It didn't work for PCI, it's not working for this case either. Stop doing it. - Choice #2: understand exactly _what_ goes wrong, and fix it analytically (ie by _understanding_ the problem, and being able to solve it exactly, and in a way you can argue about without having to resort to "magic happens"). Now, the whole analytic approach (aka "computer sciency" approach), where you can actually think about the problem without having any pesky "reality" impact the solution is obviously the one we tend to prefer. Sadly, it's seldom the one we can use in reality when it comes to things like resource allocation, since we end up starting off with often buggy approximations of what the actual hardware is all about (ie broken firmware tables). 
So I'd love to know exactly why one random number works, and why
another one doesn't. But as long as we do _not_ know the "Why" of it,
we will have to revert.

It really is that simple. It's _always_ that simple.

So the numbers shouldn't be "magic", they should have real
explanations. And in the absence of real explanation, the model that
works is "this is what we've always done". Including, very much, the
whole allocation order. Not just one random number on one random
machine.

                  Linus
Re: Linux 2.6.39-rc3
On 04/13/2011 04:39 PM, Linus Torvalds wrote: > On Wed, Apr 13, 2011 at 2:23 PM, Yinghai Lu wrote: >>> >>> What are all the magic numbers, and why would 0x8000 be special? >> >> that is the old value when kernel was doing bottom-up bootmem allocation. > > I understand, BUT THAT IS STILL A TOTALLY MAGIC NUMBER! > > It makes it come out the same ON THAT ONE MACHINE. So no, it's not > "the old value". It's a random value that gets the old value in one > specific case. Alexandre's system is working 2.6.38.2 and kernel allocate from 0xa400 Joerg's system working 2.6.39-rc3 while revert the top down bootmem patch 1a4a678b12c84db9ae5dce424e0e97f0559bb57c and kernel allocate to 0x8000. Alexandre's system is working while increasing alignment to 1g, and make kernel to allocate 0x8000 to gart. they are not working if kernel allocate from 0xa000 the 0xa000 looks like same value from radon GTT. [4.250159] radeon :01:05.0: VRAM: 320M 0xC000 - 0xD3FF (320M used) [4.258830] radeon :01:05.0: GTT: 512M 0xA000 - 0xBFFF [4.266742] [drm] Detected VRAM RAM=320M, BAR=256M [4.271549] [drm] RAM width 32bits DDR [4.275435] [TTM] Zone kernel: Available graphics memory: 1896526 kiB. [4.282066] [TTM] Initializing pool allocator. [4.282085] usb 7-2: new full speed USB device number 2 using ohci_hcd [4.293076] [drm] radeon: 320M of VRAM memory ready [4.298277] [drm] radeon: 512M of GTT memory ready. [4.303218] [drm] Supports vblank timestamp caching Rev 1 (10.10.2010). [4.309854] [drm] Driver supports precise vblank timestamp query. [4.315970] [drm] radeon: irq initialized. [4.320094] [drm] GART: num cpu pages 131072, num gpu pages 131072 Alex said that 0xa000 is ok and is from GPU address space --- The VRAM and GTT addresses in the dmesg are internal GPU addresses not system addresses. The GPU has it's own internal address space for on-chip memory clients (texture samplers, render buffers, display controllers, etc.). 
The GPU sets up two apertures in it's internal address space and on-chip client requests are forwarded to the appropriate place by the GPU's memory controller. Addresses in the GPU's VRAM aperture go to local vram on discrete cards, or to the stolen memory at the top of system memory for IGP cards. Addresses in the GPU's GTT aperture hit a page table and get forwarded to the appropriate dma pages. --- > >>> Why don't we write code that just works? >>> >>> Or absent a "just works" set of patches, why don't we revert to code >>> that has years of testing? >>> >>> This kind of "I broke things, so now I will jiggle things randomly >>> until they unbreak" is not acceptable. >>> >>> Either explain why that fixes a real BUG (and why the magic constants >>> need to be what they are), or just revert the patch that caused the >>> problem, and go back to the allocation patters that have years of >>> experience. >>> >>> Guys, we've had this discussion before, in PCI allocation. We don't do >>> this. We tried switching the PCI region allocations to top-down, and >>> IT WAS A FAILURE. We reverted it to what we had years of testing with. >>> >>> Don't just make random changes. There really are only two acceptable >>> models of development: "think and analyze" or "years and years of >>> testing on thousands of machines". Those two really do work. >> >> We did do the analyzing, and only difference seems to be: > > No. > > Yinghai, we have had this discussion before, and dammit, you need to > understand the difference between "understanding the problem" and "put > in random values until it works on one machine". > > There was absolutely _zero_ analysis done. You do not actually > understand WHY the numbers matter. You just look at two random > numbers, and one works, the other does not. That's not "analyzing". > That's just "random number games". 
> > If you cannot see and understand the difference between an actual > analytical solution where you _understand_ what the code is doing and > why, and "random numbers that happen to work on one machine", I don't > know what to tell you. > >> good one is using 0x8000 >> and bad one is using 0xa000. >> >> We try to figure out if it needs low address and it happen to work >> because kernel was doing bottom up allocation. > > No. > > Let me repeat my point one more time. > > You have TWO choices. Not more, not less: > > - choice #1: go back to the old allocation model. It's tested. It > doesn't regress. Admittedly we may not know exactly _why_ it works, > and it might not work on all machines, but it doesn't cause > regressions (ie the machines it doesn't work on it _never_ worked on). > >And this doesn't mean "old value for that _one_ machine". It means > "old value for _every_ machine". So it means we revert the whole > bottom-down thing entirely. Not just "change one random number so that > the totally different
Re: Linux 2.6.39-rc3
On 04/13/2011 12:14 PM, Yinghai Lu wrote: > > so those two patches uncover some problems. > > [0.00] Checking aperture... > [0.00] No AGP bridge found > [0.00] Node 0: aperture @ a000 size 32 MB > [0.00] Aperture pointing to e820 RAM. Ignoring. > [0.00] Your BIOS doesn't leave a aperture memory hole > [0.00] Please enable the IOMMU option in the BIOS setup > [0.00] This costs you 64 MB of RAM > [0.00] memblock_x86_reserve_range: [0xa000-0xa3ff] > aperture64 > [0.00] Mapping aperture over 65536 KB of RAM @ a000 > > so kernel try to reallocate apperture. because BIOS allocated is pointed to > RAM or size is too small. > > but your radeon does use [0xa000, 0xbfff) > > [4.281993] radeon :01:05.0: VRAM: 320M 0xC000 - > 0xD3FF (320M used) > [4.290672] radeon :01:05.0: GTT: 512M 0xA000 - > 0xBFFF > [4.298550] [drm] Detected VRAM RAM=320M, BAR=256M > [4.309857] [drm] RAM width 32bits DDR > [4.313748] [TTM] Zone kernel: Available graphics memory: 1896524 kiB. > [4.320379] [TTM] Initializing pool allocator. > [4.324948] [drm] radeon: 320M of VRAM memory ready > [4.329832] [drm] radeon: 512M of GTT memory ready. > > and the one seems working: > > [0.00] Checking aperture... > [0.00] No AGP bridge found > [0.00] Node 0: aperture @ a000 size 32 MB > [0.00] Aperture pointing to e820 RAM. Ignoring. > [0.00] Your BIOS doesn't leave a aperture memory hole > [0.00] Please enable the IOMMU option in the BIOS setup > [0.00] This costs you 64 MB of RAM > [0.00] memblock_x86_reserve_range: [0x8000-0x83ff] > aperture64 > [0.00] Mapping aperture over 65536 KB of RAM @ 8000 > [0.00] memblock_x86_reserve_range: [0xacb6bdc0-0xacb6bddf] > BOOTMEM > > will use different position... > > [4.250159] radeon :01:05.0: VRAM: 320M 0xC000 - > 0xD3FF (320M used) > [4.258830] radeon :01:05.0: GTT: 512M 0xA000 - > 0xBFFF > [4.266742] [drm] Detected VRAM RAM=320M, BAR=256M > [4.271549] [drm] RAM width 32bits DDR > [4.275435] [TTM] Zone kernel: Available graphics memory: 1896526 kiB. 
> [4.282066] [TTM] Initializing pool allocator. > [4.282085] usb 7-2: new full speed USB device number 2 using ohci_hcd > [4.293076] [drm] radeon: 320M of VRAM memory ready > [4.298277] [drm] radeon: 512M of GTT memory ready. > [4.303218] [drm] Supports vblank timestamp caching Rev 1 (10.10.2010). > [4.309854] [drm] Driver supports precise vblank timestamp query. > [4.315970] [drm] radeon: irq initialized. > [4.320094] [drm] GART: num cpu pages 131072, num gpu pages 131072 > > So question is why radeon is using the address [0xa000 - 0xc00], and > in E820 it is RAM > > [0.00] BIOS-e820: 0010 - acb8d000 (usable) > [0.00] BIOS-e820: acb8d000 - acb8f000 (reserved) > [0.00] BIOS-e820: acb8f000 - afce9000 (usable) > [0.00] BIOS-e820: afce9000 - afd21000 (reserved) > [0.00] BIOS-e820: afd21000 - afd4f000 (usable) > [0.00] BIOS-e820: afd4f000 - afdcf000 (reserved) > [0.00] BIOS-e820: afdcf000 - afecf000 (ACPI NVS) > [0.00] BIOS-e820: afecf000 - afeff000 (ACPI data) > [0.00] BIOS-e820: afeff000 - aff0 (usable) > > so looks bios program wrong address to the radon card? > Okay, staring at this, it definitely seems toxic to overlay the GART over memory areas reserved by the BIOS. If I were to guess, I would say that the problem here seems to be that the kernel thinks it is overlaying 64 MiB of memory, but the actual GART is in fact 512 MiB in size -- 131072 CPU pages -- which now overlaps the BIOS reserved areas. Alex D., could you comment on the "num cpu pages" bit? -hpa -- H. Peter Anvin, Intel Open Source Technology Center I work for Intel. I don't speak on their behalf. ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: Linux 2.6.39-rc3
On 04/13/2011 04:39 PM, Linus Torvalds wrote: > > - Choice #2: understand exactly _what_ goes wrong, and fix it > analytically (ie by _understanding_ the problem, and being able to > solve it exactly, and in a way you can argue about without having to > resort to "magic happens"). > > Now, the whole analytic approach (aka "computer sciency" approach), > where you can actually think about the problem without having any > pesky "reality" impact the solution is obviously the one we tend to > prefer. Sadly, it's seldom the one we can use in reality when it comes > to things like resource allocation, since we end up starting off with > often buggy approximations of what the actual hardware is all about > (ie broken firmware tables). > > So I'd love to know exactly why one random number works, and why > another one doesn't. But as long as we do _not_ know the "Why" of it, > we will have to revert. > Yes. However, even if we *do* revert (and the time is running short on not reverting) I would like to understand this particular one, simply because I think it may very well be a problem that is manifesting itself in other ways on other systems. The other thing that this has uncovered is that we already have a bunch of complete b*llsh*t magic numbers in this path, some of which are trivially shown to be wrong or at least completely arbitrary, so there are more issues here :( -hpa -- H. Peter Anvin, Intel Open Source Technology Center I work for Intel. I don't speak on their behalf. ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: Linux 2.6.39-rc3
On Wed, 2011-04-13 at 18:58 -0700, H. Peter Anvin wrote: > On 04/13/2011 12:14 PM, Yinghai Lu wrote: > > > > so those two patches uncover some problems. > > > > [0.00] Checking aperture... > > [0.00] No AGP bridge found > > [0.00] Node 0: aperture @ a000 size 32 MB > > [0.00] Aperture pointing to e820 RAM. Ignoring. > > [0.00] Your BIOS doesn't leave a aperture memory hole > > [0.00] Please enable the IOMMU option in the BIOS setup > > [0.00] This costs you 64 MB of RAM > > [0.00] memblock_x86_reserve_range: [0xa000-0xa3ff] > > aperture64 > > [0.00] Mapping aperture over 65536 KB of RAM @ a000 > > > > so kernel try to reallocate apperture. because BIOS allocated is pointed to > > RAM or size is too small. > > > > but your radeon does use [0xa000, 0xbfff) > > > > [4.281993] radeon :01:05.0: VRAM: 320M 0xC000 - > > 0xD3FF (320M used) > > [4.290672] radeon :01:05.0: GTT: 512M 0xA000 - > > 0xBFFF > > [4.298550] [drm] Detected VRAM RAM=320M, BAR=256M > > [4.309857] [drm] RAM width 32bits DDR > > [4.313748] [TTM] Zone kernel: Available graphics memory: 1896524 kiB. > > [4.320379] [TTM] Initializing pool allocator. > > [4.324948] [drm] radeon: 320M of VRAM memory ready > > [4.329832] [drm] radeon: 512M of GTT memory ready. > > > > and the one seems working: > > > > [0.00] Checking aperture... > > [0.00] No AGP bridge found > > [0.00] Node 0: aperture @ a000 size 32 MB > > [0.00] Aperture pointing to e820 RAM. Ignoring. > > [0.00] Your BIOS doesn't leave a aperture memory hole > > [0.00] Please enable the IOMMU option in the BIOS setup > > [0.00] This costs you 64 MB of RAM > > [0.00] memblock_x86_reserve_range: [0x8000-0x83ff] > > aperture64 > > [0.00] Mapping aperture over 65536 KB of RAM @ 8000 > > [0.00] memblock_x86_reserve_range: [0xacb6bdc0-0xacb6bddf] > > BOOTMEM > > > > will use different position... 
> > > > [4.250159] radeon :01:05.0: VRAM: 320M 0xC000 - > > 0xD3FF (320M used) > > [4.258830] radeon :01:05.0: GTT: 512M 0xA000 - > > 0xBFFF > > [4.266742] [drm] Detected VRAM RAM=320M, BAR=256M > > [4.271549] [drm] RAM width 32bits DDR > > [4.275435] [TTM] Zone kernel: Available graphics memory: 1896526 kiB. > > [4.282066] [TTM] Initializing pool allocator. > > [4.282085] usb 7-2: new full speed USB device number 2 using ohci_hcd > > [4.293076] [drm] radeon: 320M of VRAM memory ready > > [4.298277] [drm] radeon: 512M of GTT memory ready. > > [4.303218] [drm] Supports vblank timestamp caching Rev 1 (10.10.2010). > > [4.309854] [drm] Driver supports precise vblank timestamp query. > > [4.315970] [drm] radeon: irq initialized. > > [4.320094] [drm] GART: num cpu pages 131072, num gpu pages 131072 > > > > So question is why radeon is using the address [0xa000 - 0xc00], > > and in E820 it is RAM > > > > [0.00] BIOS-e820: 0010 - acb8d000 (usable) > > [0.00] BIOS-e820: acb8d000 - acb8f000 (reserved) > > [0.00] BIOS-e820: acb8f000 - afce9000 (usable) > > [0.00] BIOS-e820: afce9000 - afd21000 (reserved) > > [0.00] BIOS-e820: afd21000 - afd4f000 (usable) > > [0.00] BIOS-e820: afd4f000 - afdcf000 (reserved) > > [0.00] BIOS-e820: afdcf000 - afecf000 (ACPI NVS) > > [0.00] BIOS-e820: afecf000 - afeff000 (ACPI data) > > [0.00] BIOS-e820: afeff000 - aff0 (usable) > > > > so looks bios program wrong address to the radon card? > > > > Okay, staring at this, it definitely seems toxic to overlay the GART > over memory areas reserved by the BIOS. If I were to guess, I would say > that the problem here seems to be that the kernel thinks it is > overlaying 64 MiB of memory, but the actual GART is in fact 512 MiB in > size -- 131072 CPU pages -- which now overlaps the BIOS reserved areas. > > Alex D., could you comment on the "num cpu pages" bit? These are not CPU addresses. I think we've stated that already. Not the droids. 
the num cpu pages is how many CPU pages would be needed to fill the GPU
GTT, for those crazy cases where CPU pagesize != GPU pagesize.

Dave.
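The page-count figure from the log can be checked with a few lines of C. This is a minimal sketch assuming 4 KiB pages on both the CPU and the GPU side, which is consistent with the equal counts in the quoted log line but is still an assumption; `pages_needed` is a hypothetical helper, not a radeon driver function.

```c
#include <assert.h>
#include <stdint.h>

/* Number of pages of page_size bytes needed to cover bytes, rounded up. */
static uint64_t pages_needed(uint64_t bytes, uint64_t page_size)
{
	assert(page_size != 0);
	return (bytes + page_size - 1) / page_size;
}
```

For a 512 MiB GTT, pages_needed(512ULL << 20, 4096) gives 131072, matching "num cpu pages 131072, num gpu pages 131072". With a hypothetical 64 KiB CPU page size the CPU-side count would drop to 8192 while the GPU-side count stayed the same, which is the pagesize-mismatch case the mail alludes to.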
Re: Linux 2.6.39-rc3
On Wednesday, April 13, 2011, H. Peter Anvin wrote:
>
> Yes. However, even if we *do* revert (and the time is running short on
> not reverting) I would like to understand this particular one, simply
> because I think it may very well be a problem that is manifesting itself
> in other ways on other systems.
>
> The other thing that this has uncovered is that we already have a bunch
> of complete b*llsh*t magic numbers in this
Re: Linux 2.6.39-rc3
On Wednesday, April 13, 2011, Linus Torvalds wrote:
> On Wednesday, April 13, 2011, H. Peter Anvin wrote:
>>
>> Yes. However, even if we *do* revert (and the time is running short on
>> not reverting) I would like to understand this particular one, simply
>> because I think it may very well be a problem that is manifesting itself
>> in other ways on other systems.

sorry, fingerfart. Anyway, I agree 100%.

we definitely want to also understand the reason for things not
working, even if we do revert..

     Linus

>> of complete b*llsh*t magic numbers in this
>
Re: Linux 2.6.39-rc3
Hello,

On Wed, Apr 13, 2011 at 07:33:40PM -0700, Linus Torvalds wrote:
> On Wednesday, April 13, 2011, Linus Torvalds
> wrote:
> > On Wednesday, April 13, 2011, H. Peter Anvin wrote:
> >>
> >> Yes. However, even if we *do* revert (and the time is running short on
> >> not reverting) I would like to understand this particular one, simply
> >> because I think it may very well be a problem that is manifesting itself
> >> in other ways on other systems.
>
> sorry, fingerfart. Anyway, I agree 100%.
>
> we definitely want to also understand the reason for things not
> working, even if we do revert..

There were (and still are) places where memblock callers implemented
ad-hoc top-down allocation by stepping down start limit until
allocation succeeds. Several of them have been removed since top-down
became the default behavior, so simply reverting the commit is likely
to cause subtle issues. Maybe the best approach is introducing
@topdown parameter and use it selectively for pure memory allocations.

Thanks.

--
tejun
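The "stepping down start limit until allocation succeeds" pattern can be sketched roughly as follows. This is an illustration under stated assumptions, not code from any memblock caller: both functions are hypothetical, the bottom-up finder is a single-region toy, and the step size and region bounds in the note below are made up.

```c
#include <assert.h>
#include <stdint.h>

#define MiB (1ULL << 20)

/* Round x up to a multiple of align (align must be a power of two). */
static uint64_t round_up_pow2(uint64_t x, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (x + align - 1) & ~(align - 1);
}

/* Toy bottom-up first fit in [free_start, free_end) within [start, end). */
static uint64_t toy_find_bottom_up(uint64_t free_start, uint64_t free_end,
				   uint64_t start, uint64_t end,
				   uint64_t size, uint64_t align)
{
	uint64_t lo = free_start > start ? free_start : start;
	uint64_t hi = free_end < end ? free_end : end;
	uint64_t base = round_up_pow2(lo, align);

	if (base > hi || hi - base < size)
		return UINT64_MAX;
	return base;
}

/*
 * Ad-hoc top-down on top of a bottom-up finder: begin with the highest
 * possible start below limit and lower it by step until the bottom-up
 * search succeeds. Returns UINT64_MAX when nothing fits.
 */
static uint64_t adhoc_top_down(uint64_t free_start, uint64_t free_end,
			       uint64_t limit, uint64_t size,
			       uint64_t align, uint64_t step)
{
	uint64_t start;

	assert(step != 0 && (step & (step - 1)) == 0);
	if (limit < size)
		return UINT64_MAX;
	start = (limit - size) & ~(step - 1);
	for (;;) {
		uint64_t base = toy_find_bottom_up(free_start, free_end,
						   start, limit, size, align);
		if (base != UINT64_MAX)
			return base;
		if (start < step)
			break;
		start -= step;
	}
	return UINT64_MAX;
}
```

With a free region of 0x00100000-0xacb8d000, a 4 GiB limit, and 64 MiB for size, alignment and step, the loop settles on 0xa8000000, the same answer a native top-down search with 64 MiB alignment would give. The point of the sketch is that such callers get top-down placement regardless of the allocator's default direction.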
[Bug 28627] 2.6.31.6 is the last kernel where KMS works well on an RV515 card for regular PCI
https://bugs.freedesktop.org/show_bug.cgi?id=28627

--- Comment #21 from Connor Behan 2011-04-13 21:40:10 PDT ---
This bug largely goes away if I use kernels 2.6.37 and 2.6.38 with the
Gallium Radeon/DRI driver. In fact the glxgears framerates I get that
way are slightly better. Some things to note are that the framerates
become awful again if I turn "EXAPixmaps" "off" and that I still have
trouble logging out of X. This is surely a topic for another bug.
Thanks for all the work you've been doing!
[Bug 35312] r600g: Automatic mipmap generation doesn't work properly
https://bugs.freedesktop.org/show_bug.cgi?id=35312

--- Comment #1 from Francis Whittle 2011-04-13 22:36:09 PDT ---
Created an attachment (id=45598)
View: https://bugs.freedesktop.org/attachment.cgi?id=45598
Review: https://bugs.freedesktop.org/review?bug=35312&attachment=45598

short patch to test problem

Can you try this patch to mesa and say if it fixes the issue?
Re: Linux 2.6.39-rc3
On 04/13/2011 07:07 PM, Dave Airlie wrote:
>>
>> Okay, staring at this, it definitely seems toxic to overlay the GART
>> over memory areas reserved by the BIOS. If I were to guess, I would say
>> that the problem here seems to be that the kernel thinks it is
>> overlaying 64 MiB of memory, but the actual GART is in fact 512 MiB in
>> size -- 131072 CPU pages -- which now overlaps the BIOS reserved areas.
>>
>> Alex D., could you comment on the "num cpu pages" bit?
>
> These are not CPU addresses. I think we've stated that already. Not the
> droids.
>
> the num cpu pages is how many CPU pages would be needed to fill the GPU
> GTT, for those crazy cases where CPU pagesize != GPU pagesize.
>

OK, well, something is still weird.

	-hpa

--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.
[PATCH] drm/i915: restore only the mode of this driver on lastclose (v2)
From: Dave Airlie

i915 calls the panic handler function on last close to reset the modes,
however this is a really bad idea for multi-gpu machines, esp shareable
gpus machines. So add a new entry point for the driver to just restore
its own fbcon mode.

v2: move code into fb helper, fix panic code to block mode change on
powered off GPUs.

Signed-off-by: Dave Airlie
---
 drivers/gpu/drm/drm_fb_helper.c  |   27 ---
 drivers/gpu/drm/i915/i915_dma.c  |    2 +-
 drivers/gpu/drm/i915/intel_drv.h |    1 +
 drivers/gpu/drm/i915/intel_fb.c  |   10 ++
 include/drm/drm_fb_helper.h      |    1 +
 5 files changed, 33 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c
index 9507204..11d7a72 100644
--- a/drivers/gpu/drm/drm_fb_helper.c
+++ b/drivers/gpu/drm/drm_fb_helper.c
@@ -342,9 +342,22 @@ int drm_fb_helper_debug_leave(struct fb_info *info)
 }
 EXPORT_SYMBOL(drm_fb_helper_debug_leave);
 
+bool drm_fb_helper_restore_fbdev_mode(struct drm_fb_helper *fb_helper)
+{
+	bool error = false;
+	int i, ret;
+	for (i = 0; i < fb_helper->crtc_count; i++) {
+		struct drm_mode_set *mode_set = &fb_helper->crtc_info[i].mode_set;
+		ret = drm_crtc_helper_set_config(mode_set);
+		if (ret)
+			error = true;
+	}
+	return error;
+}
+EXPORT_SYMBOL(drm_fb_helper_restore_fbdev_mode);
+
 bool drm_fb_helper_force_kernel_mode(void)
 {
-	int i = 0;
 	bool ret, error = false;
 	struct drm_fb_helper *helper;
 
@@ -352,12 +365,12 @@ bool drm_fb_helper_force_kernel_mode(void)
 		return false;
 
 	list_for_each_entry(helper, &kernel_fb_helper_list, kernel_fb_list) {
-		for (i = 0; i < helper->crtc_count; i++) {
-			struct drm_mode_set *mode_set = &helper->crtc_info[i].mode_set;
-			ret = drm_crtc_helper_set_config(mode_set);
-			if (ret)
-				error = true;
-		}
+		if (helper->dev->switch_power_state == DRM_SWITCH_POWER_OFF)
+			continue;
+
+		ret = drm_fb_helper_restore_fbdev_mode(helper);
+		if (ret)
+			error = true;
 	}
 	return error;
 }
diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index 7273037..12876f2 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -2207,7 +2207,7 @@ void i915_driver_lastclose(struct drm_device * dev)
 	drm_i915_private_t *dev_priv = dev->dev_private;
 
 	if (!dev_priv || drm_core_check_feature(dev, DRIVER_MODESET)) {
-		drm_fb_helper_restore();
+		intel_fb_restore_mode(dev);
 		vga_switcheroo_process_delayed_switch();
 		return;
 	}
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index f5b0d83..1d20712 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -338,4 +338,5 @@ extern int intel_overlay_attrs(struct drm_device *dev, void *data,
 			       struct drm_file *file_priv);
 
 extern void intel_fb_output_poll_changed(struct drm_device *dev);
+extern void intel_fb_restore_mode(struct drm_device *dev);
 #endif /* __INTEL_DRV_H__ */
diff --git a/drivers/gpu/drm/i915/intel_fb.c b/drivers/gpu/drm/i915/intel_fb.c
index 5127827..ec49bae 100644
--- a/drivers/gpu/drm/i915/intel_fb.c
+++ b/drivers/gpu/drm/i915/intel_fb.c
@@ -264,3 +264,13 @@ void intel_fb_output_poll_changed(struct drm_device *dev)
 	drm_i915_private_t *dev_priv = dev->dev_private;
 	drm_fb_helper_hotplug_event(&dev_priv->fbdev->helper);
 }
+
+void intel_fb_restore_mode(struct drm_device *dev)
+{
+	int ret;
+	drm_i915_private_t *dev_priv = dev->dev_private;
+
+	ret = drm_fb_helper_restore_fbdev_mode(&dev_priv->fbdev->helper);
+	if (ret)
+		DRM_DEBUG("failed to restore crtc mode\n");
+}
diff --git a/include/drm/drm_fb_helper.h b/include/drm/drm_fb_helper.h
index f22e7fe..ade09d7 100644
--- a/include/drm/drm_fb_helper.h
+++ b/include/drm/drm_fb_helper.h
@@ -118,6 +118,7 @@ int drm_fb_helper_setcolreg(unsigned regno,
 			    unsigned transp,
 			    struct fb_info *info);
 
+bool drm_fb_helper_restore_fbdev_mode(struct drm_fb_helper *fb_helper);
 void drm_fb_helper_restore(void);
 void drm_fb_helper_fill_var(struct fb_info *info, struct drm_fb_helper *fb_helper,
 			    uint32_t fb_width, uint32_t fb_height);
-- 
1.7.1
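For readers without a kernel tree at hand, the control flow of the patch can be modelled in plain userspace C. This is only a sketch: every type and function name below (`fake_helper`, `fake_crtc`, `fake_set_config`, and so on) is a hypothetical stand-in, not a real DRM structure or API. It shows the two ideas in the patch: a per-helper restore that a driver can call on lastclose, and a global path that walks all helpers but skips powered-off GPUs.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real DRM structures. */
enum power_state { POWER_ON, POWER_OFF };

struct fake_crtc {
	int restore_calls;   /* how often the fbcon mode was re-applied */
	int fail;            /* non-zero makes the mock mode set fail */
};

struct fake_helper {
	enum power_state power;
	struct fake_crtc *crtcs;
	size_t crtc_count;
};

/* Mock of a mode set: records the call and returns 0 on success. */
static int fake_set_config(struct fake_crtc *crtc)
{
	crtc->restore_calls++;
	return crtc->fail ? -1 : 0;
}

/* Per-driver entry point: restore the modes of one helper only. */
static bool restore_fbdev_mode(struct fake_helper *helper)
{
	bool error = false;
	size_t i;

	for (i = 0; i < helper->crtc_count; i++) {
		if (fake_set_config(&helper->crtcs[i]))
			error = true;
	}
	return error;
}

/* Panic-style path: walk every helper, but skip powered-off GPUs. */
static bool force_kernel_mode(struct fake_helper *helpers, size_t count)
{
	bool error = false;
	size_t i;

	for (i = 0; i < count; i++) {
		if (helpers[i].power == POWER_OFF)
			continue;
		if (restore_fbdev_mode(&helpers[i]))
			error = true;
	}
	return error;
}
```

In this model a driver's lastclose handler would call `restore_fbdev_mode()` on its own helper, leaving the helpers of any other GPU untouched.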
[git pull] drm fixes
Hi Linus,

This should have gone out a few days ago, but I was trapped watching Disney shows with my daughter at home and I wanted to check it on a few more machines.

It's got two reverts, one for a change I pushed out by accident to -fixes, the other for a Xen/TTM change that looks to be causing non-Xen problems, so punting on it for now.

The rest is mostly nouveau + radeon fixes, the radeon ones fix a few regressions and stability problems on newer cards.

I suspect I'll have a few more intel fixes and v2 of the i915 patch I reverted out of this pull, it fixes a problem on the dual-gpu laptops reported a long while ago.

The following changes since commit 94c8a984ae2adbd9a9626fb42e0f2faf3e36e86f:

  Merge branch 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6 (2011-04-08 11:47:35 -0700)

are available in the git repository at:

  ssh://master.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6.git drm-fixes

Alex Deucher (7):
      drm/radeon/kms: pll tweaks for rv6xx
      drm/radeon/kms: make radeon i2c put/get bytes less noisy
      drm/radeon/kms: clean up gart dummy page handling
      drm/radeon/kms: fix suspend on rv530 asics
      drm/radeon/kms: fix pcie_p callbacks on btc and cayman
      drm/radeon/kms: add voltage type to atom set voltage function
      drm/radeon/kms: properly program vddci on evergreen+

Ben Skeggs (5):
      drm/nouveau: implement init table opcode 0x5c
      drm/nouveau: quirk for XFX GT-240X-YA
      drm/nv50: use "nv86" tlb flush method on everything except 0x50/0xac
      drm/nv50-nvc0: remove some code that doesn't belong here
      drm/nvc0: improve vm flush function

Dave Airlie (4):
      i915: restore only the mode of this driver on lastclose
      Merge remote branch 'nouveau/drm-nouveau-fixes' of /ssd/git/drm-nouveau-next into drm-fixes
      Revert "ttm: Utilize the DMA API for pages that have TTM_PAGE_FLAG_DMA32 set."
      Revert "i915: restore only the mode of this driver on lastclose"

David Dillow (1):
      drm/nv50-nvc0: work around an evo channel hang that some people see

Emil Velikov (1):
      nv30: Fix parsing of perf table

Konstantin Khlebnikov (1):
      i915: select VIDEO_OUTPUT_CONTROL for ACPI_VIDEO

Marcin Slusarz (1):
      drm/nouveau: fix oops on unload with disabled LVDS panel

Michel Dänzer (2):
      radeon: Fix KMS CP writeback on big endian machines.
      drm/radeon: Fix KMS legacy backlight support if CONFIG_BACKLIGHT_CLASS_DEVICE=m.

Roy Spliet (1):
      drm/nouveau: correct memtiming table parsing for nv4x

 drivers/gpu/drm/Kconfig                         |    1 +
 drivers/gpu/drm/nouveau/nouveau_bios.c          |   53 +++-
 drivers/gpu/drm/nouveau/nouveau_drv.h           |    2 +-
 drivers/gpu/drm/nouveau/nouveau_mem.c           |   76 +++
 drivers/gpu/drm/nouveau/nouveau_perf.c          |    2 +-
 drivers/gpu/drm/nouveau/nouveau_state.c         |   12 +---
 drivers/gpu/drm/nouveau/nv04_dfp.c              |   13 ++--
 drivers/gpu/drm/nouveau/nv50_crtc.c             |    3 -
 drivers/gpu/drm/nouveau/nv50_evo.c              |    1 +
 drivers/gpu/drm/nouveau/nv50_graph.c            |    2 +-
 drivers/gpu/drm/nouveau/nvc0_vm.c               |   24 +---
 drivers/gpu/drm/radeon/atom.c                   |    6 ++-
 drivers/gpu/drm/radeon/atombios_crtc.c          |    6 ++
 drivers/gpu/drm/radeon/evergreen.c              |   17 +++---
 drivers/gpu/drm/radeon/r600.c                   |    6 +--
 drivers/gpu/drm/radeon/radeon.h                 |   12 +++-
 drivers/gpu/drm/radeon/radeon_asic.c            |    2 +-
 drivers/gpu/drm/radeon/radeon_atombios.c        |   30 ++---
 drivers/gpu/drm/radeon/radeon_fence.c           |    2 +-
 drivers/gpu/drm/radeon/radeon_gart.c            |    2 +
 drivers/gpu/drm/radeon/radeon_i2c.c             |    4 +-
 drivers/gpu/drm/radeon/radeon_legacy_encoders.c |    2 +-
 drivers/gpu/drm/radeon/radeon_pm.c              |   11 +++-
 drivers/gpu/drm/radeon/radeon_ring.c            |    2 +-
 drivers/gpu/drm/radeon/rs600.c                  |    2 +-
 drivers/gpu/drm/radeon/rv770.c                  |    6 +--
 drivers/gpu/drm/ttm/ttm_page_alloc.c            |   26 +---
 drivers/gpu/stub/Kconfig                        |    1 +
 28 files changed, 201 insertions(+), 125 deletions(-)
[PATCH] drm/i915: restore only the mode of this driver on lastclose (v2)
On Wed, 13 Apr 2011 09:35:55 +1000, Dave Airlie wrote: > From: Dave Airlie > > i915 calls the panic handler function on last close to reset the modes, > however this is a really bad idea for multi-gpu machines, esp shareable > gpus machines. So add a new entry point for the driver to just restore > its own fbcon mode. > > v2: move code into fb helper, fix panic code to block mode change on > powered off GPUs. 2 bugs in one patch? This could be split into 3 steps... ;-) Aside from that, looks good. -Chris -- Chris Wilson, Intel Open Source Technology Centre
Linux 2.6.39-rc3
* Joerg Roedel wrote: > > > The problem does not happen with 2.6.38. I try to bisect this further > > > down to a commit. Alex, please let me know if you need any further > > > information. > > > > If you can bisect it, that would be great. Thanks, > > Bisecting actually gave a very weird result. It points to > > d2137d5af4259f50c19addb8246a186c9ffac325 > > which is a merge-commit in the x86 tree. Even more weird is that this > notebook is the only machine with these symptoms, all my other boxes are > fine. > > During the bisect I tested commits from Yinghai which were good. It seems > like the problem appeared with the merge. There's a similar looking bug being debugged here: https://bugzilla.kernel.org/show_bug.cgi?id=33012 Could you please send the before/after bootlog (in particular all memory init messages included) and your .config? before: f005fe12b90c: x86-64: Move out cleanup higmap [_brk_end, _end) out of init_memory_mapping() after: d2137d5af425: Merge branch 'linus' into x86/bootmem I've Cc:-ed more people who might have an idea about it. Thanks, Ingo
[Bug 34534] resolution 3840x1024 stopped to work on HD5850 after switch to 2.6.37 kernel
https://bugs.freedesktop.org/show_bug.cgi?id=34534 --- Comment #15 from Peter Hercek 2011-04-13 00:37:56 PDT --- Created an attachment (id=45562) --> (https://bugs.freedesktop.org/attachment.cgi?id=45562) xrandr --verbose output on 2.6.38.2-vanilla (with 3840x1024 fixed using radeonreg regset 0x770c 0x00020004) -- Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email --- You are receiving this mail because: --- You are the assignee for the bug.
Revert 737a3bb9416ce2a7c7a4170852473a4fcc9c67e8 ?
On Tue, Apr 12, 2011 at 07:29:22PM +0200, Michel Dänzer wrote: > On Die, 2011-04-12 at 14:00 +0200, Gabriel Paubert wrote: > > On Tue, Apr 12, 2011 at 01:46:10PM +0200, Michel Dänzer wrote: > > > > > > > > With no_wb=1 the driver goes a bit further but the X server ends > > > > up in an infinite ioctl loop and the logs are: > > > > > > Which ioctl does it loop on? Please provide the Xorg.0.log file as well. > > > > From memory, the code was 0x64, which is DRM_RADEON_GEM_WAIT_IDLE. > > Note that it's normal for this ioctl to be called every time before the > GPU accessible pixmap memory is accessed by the CPU. Unless the ioctl > always returns an error, this may not indicate a problem on its own. It seems to be an infinite loop, always returning EINTR because of regular SIGALRM delivery. > > > > The Xorg.0.log from the previous boot is attached. > > I don't see any obvious problems in it. Can you describe the symptoms of > the problem you're having with X a bit more? Well, X is dead, or rather in an infinite ioctl loop as described above. IIRC, the display enters a power-down mode and there is nothing to see. > > One thing I notice is that the X server/driver are rather oldish. Maybe > you can try newer versions from testing, sid or even experimental to see > if that makes any difference. I lack time to do it until early May (being away for 2 weeks starting on Friday and busy on urgent things). I'm indeed on Debian stable (Squeeze), which is rather recent and the machine is about 2 1/2 years old. Gabriel
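The symptom described here (an ioctl that keeps returning EINTR while SIGALRM arrives at regular intervals) is what a retry-on-EINTR wrapper would produce if the call never completes between two signals. The following is a minimal sketch of such a loop, not the actual X driver or libdrm code: `retry_on_eintr`, `ioctl_fn` and `mock_wait_idle` are hypothetical names, and a `max_tries` bound is added only so the sketch terminates.

```c
#include <errno.h>

/* Hypothetical callback type standing in for a real ioctl on a DRM fd. */
typedef int (*ioctl_fn)(void *ctx);

/*
 * Retry while the call is interrupted. If a signal arrives on every
 * attempt, this loop only ends once max_tries is exhausted; without
 * such a bound it would spin forever.
 */
static int retry_on_eintr(ioctl_fn fn, void *ctx, int max_tries, int *tries_out)
{
	int ret = -1;
	int tries = 0;

	while (tries < max_tries) {
		tries++;
		ret = fn(ctx);
		if (ret == 0 || errno != EINTR)
			break;
	}
	if (tries_out)
		*tries_out = tries;
	return ret;
}

/* Mock: fails with EINTR until the counter in ctx reaches zero. */
static int mock_wait_idle(void *ctx)
{
	int *interruptions_left = ctx;

	if (*interruptions_left > 0) {
		(*interruptions_left)--;
		errno = EINTR;
		return -1;
	}
	return 0;
}
```

With a GPU that never becomes idle, every attempt in such a loop is interrupted before it can finish, which from the outside looks like an infinite ioctl loop.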
Revert 737a3bb9416ce2a7c7a4170852473a4fcc9c67e8 ?
On Tue, Apr 12, 2011 at 01:46:10PM +0200, Michel Dänzer wrote: > BTW, if your kernel contains commit > 69a07f0b117a40fcc1a479358d8e1f41793617f2, can you try if reverting that > helps? My kernel is pristine 2.6.38 and does not include this commit (was introduced before 2.6.39-rc1 according to gitk). Gabriel
[Bug 34534] resolution 3840x1024 stopped to work on HD5850 after switch to 2.6.37 kernel
https://bugs.freedesktop.org/show_bug.cgi?id=34534

--- Comment #16 from Peter Hercek 2011-04-13 01:37:03 PDT ---
(In reply to comment #14)
> Does this patch help?

No, the image stays corrupted, I still need to do this to fix it:

# radeonreg regset 0x770c 0x00020004
OLD: 0x770c (770c)0x00010005 (65541)
NEW: 0x770c (770c)0x00010004 (65540)
#

I applied and tested the patch with 2.6.38.2-vanilla.
Revert 737a3bb9416ce2a7c7a4170852473a4fcc9c67e8 ?
On Wed, 2011-04-13 at 09:59 +0200, Gabriel Paubert wrote: > > Well, X is dead, or rather in an infinite ioctl loop as described > above. > IIRC, the display enters a power-down mode and there is nothing to > see. So basically the card crashed. There's about an infinite amount of reasons why radeons do so, sometimes it has to do with them not liking what you ate that day... The only thing I can see that could be of use would be a bisect Cheers, Ben.
Revert 737a3bb9416ce2a7c7a4170852473a4fcc9c67e8 ?
On Wed, Apr 13, 2011 at 06:16:13PM +1000, Benjamin Herrenschmidt wrote: > On Wed, 2011-04-13 at 09:59 +0200, Gabriel Paubert wrote: > > > > Well, X is dead, or rather in an infinite ioctl loop as described > > above. > > IIRC, the display enters a power-down mode and there is nothing to > > see. > > So basically the card crashed. There's about an infinite amount of > reasons why radeons do so, sometimes it has to do with them not liking > what you ate that day... > > The only thing I can see that could be of use would be a bisect Bisecting for something which I have never got to work (radeon with KMS) on this machine is something I don't know how to do... Note that radeon without KMS also always ends up crashing, but it may take hours. The only case where the machine works reliably is when glxinfo claims that it is using software rendering. Regards, Gabriel
small git lesson [Was: Re: Revert 737a3bb9416ce2a7c7a4170852473a4fcc9c67e8 ?]
On Wed, Apr 13, 2011 at 10:59:14AM +0200, Andreas Schwab wrote:
> Uwe Kleine-König writes:
>
> > $ git name-rev --refs=refs/tags/v2.6\* 69a07f0b117a40fcc1a479358d8e1f41793617f2
> > 69a07f0b117a40fcc1a479358d8e1f41793617f2 tags/v2.6.39-rc2~3^2~43^2~4
> >
> > so it was introduced just before -rc2.
>
> $ git tag --contains 69a07f0b117a40fcc1a479358d8e1f41793617f2
> v2.6.39-rc1
> v2.6.39-rc2
>

So who is right? I think it was before rc1.

Anyway I'm aware that there are other git commands, although for the option details I often have to have a look at the man page.

However in this case the main reason to fire up gitk was to have a quick look at the patch and its context, and I simply reported the "Precedes" line in the display, which is 2.6.39-rc1. It also follows v2.6.37-rc2, which means that it has been quite a long time outside the main tree.

Gabriel
[Bug 35502] Regression: black screen with Radeon KMS in 2.6.38 (2.6.37.4 worked fine)
https://bugs.freedesktop.org/show_bug.cgi?id=35502

Michel Dänzer changed:

           What    |Removed                     |Added
                 CC|                            |bryce at canonical.com

--- Comment #9 from Michel Dänzer 2011-04-13 04:45:42 PDT ---
*** Bug 36007 has been marked as a duplicate of this bug. ***
[PATCH] Big Endian support for RV730 (Mesa r600)
On Tue, 2011-04-12 at 10:01 +0200, Cédric Cano wrote: > Hi > > Here you are a patch that adds big endian support for rv730 in r600 > classic mesa driver. The BE modifications are almost the same as the DRM > / DDX driver modifications > (http://lists.freedesktop.org/archives/dri-devel/2011-February/008151.html). > > I used the mesa-demos to test the driver status on big endian platform. > Nearly all demos renders the same as on Intel architecture. > Nevertheless, there are still some issues in glReadPixels (r600_blit) > with some formats. I can't figure out exactly what and when data must be > swapped (set_tex_resoures, set_render_target...). Review of the patch > would be greatly appreciated. > > It seems that r600g will be the default for Mesa 7.11 so I'll try to > enable big endian support for Gallium now. Cool stuff ! I'll try to test that one of these days on various ppc's Cheers, Ben.
[PATCH] Big Endian support for RV730 (Mesa r600)
On Wed, 2011-04-13 at 22:05 +1000, Benjamin Herrenschmidt wrote: > On Tue, 2011-04-12 at 10:01 +0200, Cédric Cano wrote: > > Hi > > > > Here you are a patch that adds big endian support for rv730 in r600 > > classic mesa driver. The BE modifications are almost the same as the DRM > > / DDX driver modifications > > (http://lists.freedesktop.org/archives/dri-devel/2011-February/008151.html). > > > > I used the mesa-demos to test the driver status on big endian platform. > > Nearly all demos renders the same as on Intel architecture. > > Nevertheless, there are still some issues in glReadPixels (r600_blit) > > with some formats. I can't figure out exactly what and when data must be > > swapped (set_tex_resoures, set_render_target...). Review of the patch > > would be greatly appreciated. > > > > It seems that r600g will be the default for Mesa 7.11 so I'll try to > > enable big endian support for Gallium now. > > Cool stuff ! > > I'll try to test that one of these days on various ppc's BTW. I see you used some FSL embedded board. Do you have your PCIe MMIO space above 32-bit ? Last I looked, there was a bunch of fixing needing to be done, among others in the TTM, to make that work. I had some preliminary patches but they bitrot... mostly the issue is to make sure that a phys_addr_t is used instead of an unsigned long whenever it tries to store the physical address of an object. Ben.
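The phys_addr_t point can be illustrated in isolation. The sketch below assumes a 32-bit kernel where unsigned long is 32 bits wide while physical addresses need more; `ulong32_t` and `phys_addr64_t` are hypothetical stand-ins chosen so the example behaves the same on any host, and the address used in the usage note is made up.

```c
#include <stdint.h>

/* Assumption: model a 32-bit kernel's unsigned long and a 64-bit phys_addr_t. */
typedef uint32_t ulong32_t;      /* hypothetical stand-in for unsigned long */
typedef uint64_t phys_addr64_t;  /* hypothetical stand-in for phys_addr_t */

/* Storing a physical address in the narrow type drops the upper 32 bits. */
static ulong32_t store_narrow(phys_addr64_t addr)
{
	return (ulong32_t)addr;
}

/* Storing it in the wide type keeps the full address. */
static phys_addr64_t store_wide(phys_addr64_t addr)
{
	return addr;
}

/* True when the narrow copy no longer matches the original address. */
static int is_truncated(phys_addr64_t addr)
{
	return (phys_addr64_t)store_narrow(addr) != store_wide(addr);
}
```

For an MMIO window placed above 4 GiB, say at the made-up address 0xC00000000, `store_narrow()` yields 0, so any code path keeping the address in the narrow type would end up mapping the wrong location.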
Revert 737a3bb9416ce2a7c7a4170852473a4fcc9c67e8 ?
On Mit, 2011-04-13 at 09:59 +0200, Gabriel Paubert wrote: > On Tue, Apr 12, 2011 at 07:29:22PM +0200, Michel Dänzer wrote: > > On Die, 2011-04-12 at 14:00 +0200, Gabriel Paubert wrote: > > > On Tue, Apr 12, 2011 at 01:46:10PM +0200, Michel Dänzer wrote: > > > > > > > > > > With no_wb=1 the driver goes a bit further but the X server ends > > > > > up in an infinite ioctl loop and the logs are: > > > > > > > > Which ioctl does it loop on? Please provide the Xorg.0.log file as well. > > > > > > From memory, the code was 0x64, which is DRM_RADEON_GEM_WAIT_IDLE. > > > > Note that it's normal for this ioctl to be called every time before the > > GPU accessible pixmap memory is accessed by the CPU. Unless the ioctl > > always returns an error, this may not indicate a problem on its own. > > It seems to be an infinite loop, always returning EINTR because > of regular SIGALRM delivery. That does sound like the GPU locks up. Do you get any messages in dmesg about lockups and attempts to reset the GPU at any time? -- Earthling Michel Dänzer |http://www.vmware.com Libre software enthusiast | Debian, X and DRI developer
Revert 737a3bb9416ce2a7c7a4170852473a4fcc9c67e8 ?
On Wed, Apr 13, 2011 at 02:12:16PM +0200, Michel Dänzer wrote: > On Mit, 2011-04-13 at 09:59 +0200, Gabriel Paubert wrote: > > On Tue, Apr 12, 2011 at 07:29:22PM +0200, Michel Dänzer wrote: > > > On Die, 2011-04-12 at 14:00 +0200, Gabriel Paubert wrote: > > > > On Tue, Apr 12, 2011 at 01:46:10PM +0200, Michel Dänzer wrote: > > > > > > > > > > > > With no_wb=1 the driver goes a bit further but the X server ends > > > > > > up in an infinite ioctl loop and the logs are: > > > > > > > > > > Which ioctl does it loop on? Please provide the Xorg.0.log file as > > > > > well. > > > > > > > > From memory, the code was 0x64, which is DRM_RADEON_GEM_WAIT_IDLE. > > > > > > Note that it's normal for this ioctl to be called every time before the > > > GPU accessible pixmap memory is accessed by the CPU. Unless the ioctl > > > always returns an error, this may not indicate a problem on its own. > > > > It seems to be an infinite loop, always returning EINTR because > > of regular SIGALRM delivery. > > That does sound like the GPU locks up. Do you get any messages in dmesg > about lockups and attempts to reset the GPU at any time? No. Gabriel
Revert 737a3bb9416ce2a7c7a4170852473a4fcc9c67e8 ?
On Mit, 2011-04-13 at 14:27 +0200, Gabriel Paubert wrote: > On Wed, Apr 13, 2011 at 02:12:16PM +0200, Michel Dänzer wrote: > > On Mit, 2011-04-13 at 09:59 +0200, Gabriel Paubert wrote: > > > On Tue, Apr 12, 2011 at 07:29:22PM +0200, Michel Dänzer wrote: > > > > On Die, 2011-04-12 at 14:00 +0200, Gabriel Paubert wrote: > > > > > On Tue, Apr 12, 2011 at 01:46:10PM +0200, Michel Dänzer wrote: > > > > > > > > > > > > > > With no_wb=1 the driver goes a bit further but the X server ends > > > > > > > up in an infinite ioctl loop and the logs are: > > > > > > > > > > > > Which ioctl does it loop on? Please provide the Xorg.0.log file as > > > > > > well. > > > > > > > > > > From memory, the code was 0x64, which is DRM_RADEON_GEM_WAIT_IDLE. > > > > > > > > Note that it's normal for this ioctl to be called every time before the > > > > GPU accessible pixmap memory is accessed by the CPU. Unless the ioctl > > > > always returns an error, this may not indicate a problem on its own. > > > > > > It seems to be an infinite loop, always returning EINTR because > > > of regular SIGALRM delivery. > > > > That does sound like the GPU locks up. Do you get any messages in dmesg > > about lockups and attempts to reset the GPU at any time? > > No. Hmm, I guess the constant SIGALRMs might prevent the lockup detection from kicking in... Maybe you can try starting the X server with -dumbSched to see if that gets things along any further, but in the end there's probably no way around figuring out what causes the lockup and fixing that anyway. -- Earthling Michel Dänzer |http://www.vmware.com Libre software enthusiast | Debian, X and DRI developer
[PATCH] drm/radeon/kms: fix suspend on rv530 asics
On Tue, Apr 12, 2011 at 1:33 PM, Alex Deucher wrote:
> Apparently only rv515 asics need the workaround
> added in f24d86f1a49505cdea56728b853a5d0a3f8e3d11
> (drm/radeon/kms: fix resume regression for some r5xx laptops).
>
> Fixes:
> https://bugs.freedesktop.org/show_bug.cgi?id=34709
>
> Signed-off-by: Alex Deucher
> Cc: stable at kernel.org
> ---
>  drivers/gpu/drm/radeon/atom.c |    6 +-
>  1 files changed, 5 insertions(+), 1 deletions(-)
>
> diff --git a/drivers/gpu/drm/radeon/atom.c b/drivers/gpu/drm/radeon/atom.c
> index 258fa5e..d71d375 100644
> --- a/drivers/gpu/drm/radeon/atom.c
> +++ b/drivers/gpu/drm/radeon/atom.c
> @@ -32,6 +32,7 @@
>  #include "atom.h"
>  #include "atom-names.h"
>  #include "atom-bits.h"
> +#include "radeon.h"
>
>  #define ATOM_COND_ABOVE                0
>  #define ATOM_COND_ABOVEOREQUAL 1
> @@ -101,7 +102,9 @@ static void debug_print_spaces(int n)
>  static uint32_t atom_iio_execute(struct atom_context *ctx, int base,
>                                  uint32_t index, uint32_t data)
>  {
> +       struct radeon_device *rdev = ctx->card->dev->dev_private;
>         uint32_t temp = 0xCDCDCDCD;
> +
>         while (1)
>                 switch (CU8(base)) {
>                 case ATOM_IIO_NOP:
> @@ -112,7 +115,8 @@ static uint32_t atom_iio_execute(struct atom_context *ctx, int base,
>                         base += 3;
>                         break;
>                 case ATOM_IIO_WRITE:
> -                       (void)ctx->card->ioreg_read(ctx->card, CU16(base + 1));
> +                       if (rdev->family == CHIP_RV515)
> +                               (void)ctx->card->ioreg_read(ctx->card, CU16(base + 1));
>                         ctx->card->ioreg_write(ctx->card, CU16(base + 1), temp);
>                         base += 3;
>                         break;
> --
> 1.7.1.1
>

So this patch enables io write only for one family? This looks utterly strange.

Cheers,
Jerome
[PATCH] drm/radeon/kms: fix suspend on rv530 asics
On Wed, Apr 13, 2011 at 10:46 AM, Jerome Glisse wrote:
> On Tue, Apr 12, 2011 at 1:33 PM, Alex Deucher wrote:
>> Apparently only rv515 asics need the workaround
>> added in f24d86f1a49505cdea56728b853a5d0a3f8e3d11
>> (drm/radeon/kms: fix resume regression for some r5xx laptops).
>>
>> Fixes:
>> https://bugs.freedesktop.org/show_bug.cgi?id=34709
>>
>> Signed-off-by: Alex Deucher
>> Cc: stable at kernel.org
>> ---
>>  drivers/gpu/drm/radeon/atom.c |    6 +-
>>  1 files changed, 5 insertions(+), 1 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/radeon/atom.c b/drivers/gpu/drm/radeon/atom.c
>> index 258fa5e..d71d375 100644
>> --- a/drivers/gpu/drm/radeon/atom.c
>> +++ b/drivers/gpu/drm/radeon/atom.c
>> @@ -32,6 +32,7 @@
>>  #include "atom.h"
>>  #include "atom-names.h"
>>  #include "atom-bits.h"
>> +#include "radeon.h"
>>
>>  #define ATOM_COND_ABOVE                0
>>  #define ATOM_COND_ABOVEOREQUAL 1
>> @@ -101,7 +102,9 @@ static void debug_print_spaces(int n)
>>  static uint32_t atom_iio_execute(struct atom_context *ctx, int base,
>>                                  uint32_t index, uint32_t data)
>>  {
>> +       struct radeon_device *rdev = ctx->card->dev->dev_private;
>>         uint32_t temp = 0xCDCDCDCD;
>> +
>>         while (1)
>>                 switch (CU8(base)) {
>>                 case ATOM_IIO_NOP:
>> @@ -112,7 +115,8 @@ static uint32_t atom_iio_execute(struct atom_context *ctx, int base,
>>                         base += 3;
>>                         break;
>>                 case ATOM_IIO_WRITE:
>> -                       (void)ctx->card->ioreg_read(ctx->card, CU16(base + 1));
>> +                       if (rdev->family == CHIP_RV515)
>> +                               (void)ctx->card->ioreg_read(ctx->card, CU16(base + 1));
>>                         ctx->card->ioreg_write(ctx->card, CU16(base + 1), temp);
>>                         base += 3;
>>                         break;
>> --
>> 1.7.1.1
>>
>
> So this patch enables io write only for one family? This looks utterly
> strange.

No, it just does a read before write for rv515. I don't know why it needs it, but it seems to.

Alex

>
> Cheers,
> Jerome
>
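To make the read-before-write point concrete, here is a small userspace model of the patched ATOM_IIO_WRITE case. It is a sketch under stated assumptions: `fake_card`, `fake_ioreg_read`, `fake_ioreg_write` and the `FAKE_CHIP_*` values are hypothetical mocks that only count register accesses; the one detail taken from the patch is that the dummy read is gated on the RV515 family while the write always happens.

```c
#include <stdint.h>

/* Hypothetical chip families; only the RV515 name comes from the patch. */
enum fake_family { FAKE_CHIP_RV515, FAKE_CHIP_OTHER };

struct fake_card {
	enum fake_family family;
	uint32_t reg;      /* single mock register */
	int reads;         /* number of register reads issued */
	int writes;        /* number of register writes issued */
};

static uint32_t fake_ioreg_read(struct fake_card *card)
{
	card->reads++;
	return card->reg;
}

static void fake_ioreg_write(struct fake_card *card, uint32_t val)
{
	card->writes++;
	card->reg = val;
}

/*
 * Mirrors the patched ATOM_IIO_WRITE case: every family performs the
 * write, but only RV515 issues the dummy read first.
 */
static void iio_write(struct fake_card *card, uint32_t val)
{
	if (card->family == FAKE_CHIP_RV515)
		(void)fake_ioreg_read(card);
	fake_ioreg_write(card, val);
}
```

Calling `iio_write()` on both kinds of card leaves the same value in the register; the only difference is the extra read counted on the RV515 mock.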
[Bug 25588] Lots of ARB_vertex_program/fragment_program parser errors in ETQW (if GLSL is unavailable)
https://bugs.freedesktop.org/show_bug.cgi?id=25588

Fabio Pedretti changed:

           What    |Removed                           |Added
         Resolution|WORKSFORME                        |WONTFIX
          Component|Mesa core                         |Drivers/DRI/r300
         AssignedTo|mesa-dev at lists.freedesktop.org |dri-devel at lists.freedesktop.org
[Bug 33222] New: [RADEON] Oops in worker thread for radeon_unpin_work_func
https://bugzilla.kernel.org/show_bug.cgi?id=33222

           Summary: [RADEON] Oops in worker thread for radeon_unpin_work_func
           Product: Drivers
           Version: 2.5
    Kernel Version: 2.6.38.2
          Platform: All
        OS/Version: Linux
              Tree: Mainline
            Status: NEW
          Severity: low
          Priority: P1
         Component: Video(DRI - non Intel)
        AssignedTo: drivers_video-dri at kernel-bugs.osdl.org
        ReportedBy: thomas at m3y3r.de
        Regression: No

Created an attachment (id=54282)
 --> (https://bugzilla.kernel.org/attachment.cgi?id=54282)
Oops - Part 1

A few days ago I stumbled upon the attached oops. Just images, sorry for that.
This is the first time I saw this oops. I just hit it once for 2.6.38.

-- 
Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are watching the assignee of the bug.

--
Forrester Wave Report - Recovery time is now measured in hours and minutes not days. Key insights are discussed in the 2010 Forrester Wave Report as part of an in-depth evaluation of disaster recovery service providers. Forrester found the best-in-class provider in terms of services and vision. Read this report now! http://p.sf.net/sfu/ibm-webcastpromo
--
___
Dri-devel mailing list
Dri-devel at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel
[Bug 33222] [RADEON] Oops in worker thread for radeon_unpin_work_func
https://bugzilla.kernel.org/show_bug.cgi?id=33222 --- Comment #1 from Thomas Meyer 2011-04-13 17:09:48 --- Created an attachment (id=54292) --> (https://bugzilla.kernel.org/attachment.cgi?id=54292) Oops - Part 2
small git lesson [Was: Re: Revert 737a3bb9416ce2a7c7a4170852473a4fcc9c67e8 ?]
On Wed, Apr 13, 2011 at 10:02:04AM +0200, Gabriel Paubert wrote:
> On Tue, Apr 12, 2011 at 01:46:10PM +0200, Michel Dänzer wrote:
> > BTW, if your kernel contains commit
> > 69a07f0b117a40fcc1a479358d8e1f41793617f2, can you try if reverting that
> > helps?
>
> My kernel is pristine 2.6.38 and does not include this commit
> (was introduced before 2.6.39-rc1 according to gitk).

gitk is not the best tool to find this out.

	$ git name-rev --refs=refs/tags/v2.6\* 69a07f0b117a40fcc1a479358d8e1f41793617f2
	69a07f0b117a40fcc1a479358d8e1f41793617f2 tags/v2.6.39-rc2~3^2~43^2~4

so it was introduced just before -rc2.

Best regards
Uwe

-- 
Pengutronix e.K.                           | Uwe Kleine-König            |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
small git lesson [Was: Re: Revert 737a3bb9416ce2a7c7a4170852473a4fcc9c67e8 ?]
Uwe Kleine-König writes:

> $ git name-rev --refs=refs/tags/v2.6\* 69a07f0b117a40fcc1a479358d8e1f41793617f2
> 69a07f0b117a40fcc1a479358d8e1f41793617f2 tags/v2.6.39-rc2~3^2~43^2~4
>
> so it was introduced just before -rc2.

$ git tag --contains 69a07f0b117a40fcc1a479358d8e1f41793617f2
v2.6.39-rc1
v2.6.39-rc2

Andreas.

-- 
Andreas Schwab, schwab at redhat.com
GPG Key fingerprint = D4E8 DBE3 3813 BB5D FA84 5EC7 45C6 250E 6F00 984E
"And now for something completely different."
small git lesson [Was: Re: Revert 737a3bb9416ce2a7c7a4170852473a4fcc9c67e8 ?]
Hello Gabriel

On Wed, Apr 13, 2011 at 12:31:44PM +0200, Gabriel Paubert wrote:
> On Wed, Apr 13, 2011 at 10:59:14AM +0200, Andreas Schwab wrote:
> > Uwe Kleine-König writes:
> >
> > > $ git name-rev --refs=refs/tags/v2.6\* 69a07f0b117a40fcc1a479358d8e1f41793617f2
> > > 69a07f0b117a40fcc1a479358d8e1f41793617f2 tags/v2.6.39-rc2~3^2~43^2~4
> > >
> > > so it was introduced just before -rc2.
> >
> > $ git tag --contains 69a07f0b117a40fcc1a479358d8e1f41793617f2
> > v2.6.39-rc1
> > v2.6.39-rc2
> >
>
> So who is right? I think it was before rc1.

Yep, correct. I interpreted the output of git name-rev to mean it's not included in a tag earlier than v2.6.39-rc2, but actually that's wrong. It's just that it's easier (for some definition of easy) to reach the commit in question from v2.6.39-rc2 than from v2.6.39-rc1.

> However in this case the main reason to fire gitk was to have a quick look
> at the patch and its context, and simply reported the "Precedes" line
> in the display, which is 2.6.39-rc1. It also follow v2.6.37-rc2, which means
> that it has been quite a long time outside the main tree.

I think this conclusion isn't valid in general. (E.g. in git itself a bug-fix is often done on top of the commit that introduced the bug and then merged into master. Still the bugfix might be new.) But looking at the AuthorDate of 69a07f0b117a seems to support your statement.

Best regards
Uwe

-- 
Pengutronix e.K.                           | Uwe Kleine-König            |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
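The distinction discussed in this thread comes down to reachability: a tag contains a commit when the commit can be reached from the tagged commit by following parent links, and several tags can contain the same commit. The toy model below sketches only that idea; it is not how git is implemented, and `struct commit`, `contains` and `earliest_tag` are hypothetical names for illustration.

```c
#include <stddef.h>

#define MAX_PARENTS 2

/* Toy commit: up to two parents, as for a merge commit. */
struct commit {
	const struct commit *parents[MAX_PARENTS];
};

/*
 * A tag "contains" a commit when the commit is reachable from the tagged
 * commit by following parent links. Plain recursion is fine for a toy
 * graph; real histories need a visited set.
 */
static int contains(const struct commit *tagged, const struct commit *target)
{
	size_t i;

	if (tagged == NULL)
		return 0;
	if (tagged == target)
		return 1;
	for (i = 0; i < MAX_PARENTS; i++) {
		if (contains(tagged->parents[i], target))
			return 1;
	}
	return 0;
}

/* Return the index of the first tag (oldest first) that contains target, or -1. */
static int earliest_tag(const struct commit *const *tags, size_t ntags,
			const struct commit *target)
{
	size_t i;

	for (i = 0; i < ntags; i++) {
		if (contains(tags[i], target))
			return (int)i;
	}
	return -1;
}
```

If a side-branch commit is merged before the first of two tags, both tags contain it, and the earliest one answers the question of when it entered the tree. A name given relative to the later tag is still a valid way to reach the commit, which is why such a name alone does not show which tag first included it.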
[Bug 33222] [RADEON] Oops in worker thread for radeon_unpin_work_func
https://bugzilla.kernel.org/show_bug.cgi?id=33222 Alex Deucher changed: What|Removed |Added CC||alexdeucher at gmail.com --- Comment #2 from Alex Deucher 2011-04-13 17:19:06 --- This is a duplicate of bug 32402.
Linux 2.6.39-rc3
On Wed, Apr 13, 2011 at 08:46:09AM +0200, Ingo Molnar wrote:
> Could you please send the before/after bootlog (in particular all memory init
> messages included) and your .config?
>
> before: f005fe12b90c: x86-64: Move out cleanup higmap [_brk_end, _end) out of init_memory_mapping()
> after: d2137d5af425: Merge branch 'linus' into x86/bootmem
>
> I've Cc:-ed more people who might have an idea about it.

Okay, I have done some more bisecting and debugging today. First of all, I bisected between v2.6.37-rc2..f005fe12b90c, which were only a couple of patches, and merged v2.6.38-rc4 in at every step. There was no failure found.

Then I tried this again, but this time I merged v2.6.38-rc5 at every step and was successful. The bad commit in this branch turned out to be

	1a4a678b12c84db9ae5dce424e0e97f0559bb57c

which is related to memblock.

Then I tried to find out which change between 2.6.38-rc4 and 2.6.38-rc5 is needed to trigger the failure, so I used f005fe12b90c as a base, bisected between v2.6.38-rc4..v2.6.38-rc5 and merged every bisect step into the base and tested. Here the bad commit turned out to be

	e6d2e2b2b1e1455df16d68a78f4a3874c7b3ad20

which is related to gart. It turned out that the gart aperture on that box is at a different position with these patches. Before it was at 0xa400 and now it is at 0xa000. It seems like this has something to do with the root cause.

Reverting commit 1a4a678b12c84db9ae5dce424e0e97f0559bb57c fixes the problem btw. and booting with iommu=soft also works, but I have no idea yet why the aperture at that address is a problem (with the patch reverted the aperture lands at 0x8000).

I have put some debug-data online. There is my .config and two dmesg-files for good (==2.6.39-rc3 + revert) and bad (==2.6.39-rc3). I also created these dmesg-files again with memblock=debug, maybe that helps to find the problem. The files are at

	http://www.8bytes.org/~joro/debug/

Or maybe someone else has an idea about the issue...

	Joerg
[Bug 30651] [RADEON:KMS:R600G] gl output in mplayer have no colors if used with a fragment program with additional lookup and bicubic B-spline filtering
https://bugs.freedesktop.org/show_bug.cgi?id=30651 --- Comment #7 from Andy Furniss 2011-04-13 10:23:46 PDT --- (In reply to comment #6) > 1) yuv=4 on r600g still have no colours even though with r300g they are ok yuv=4 with or without bicubic now works for me on 600g > 2) still there is an overbright glitch in some white places in some videos > with > yuv=6. but again, it may be a mplayer bug since it present with r300g too (but > not software rasterizer), i'm not sure. This is still the same. One general observation is that with 600g perf is poor compared to 600c or xv, which are at least 2x faster when benchmarking with HD streams. -- Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email --- You are receiving this mail because: --- You are the assignee for the bug.
[Bug 30651] [RADEON:KMS:R600G] gl output in mplayer have no colors if used with a fragment program with additional lookup and bicubic B-spline filtering
https://bugs.freedesktop.org/show_bug.cgi?id=30651 --- Comment #8 from Sergey Kondakov 2011-04-13 10:46:29 PDT --- same here. and i never got answer about which method is better with amd/ati card and open stack now. i hope devs are looking into that stuff.
[Bug 30651] [RADEON:KMS:R600G] gl output in mplayer have no colors if used with a fragment program with additional lookup and bicubic B-spline filtering
https://bugs.freedesktop.org/show_bug.cgi?id=30651 --- Comment #9 from Andy Furniss 2011-04-13 11:38:47 PDT --- (In reply to comment #8) > same here. > and i never got answer about which method is better with amd/ati card and open > stack now. i hope devs are looking into that stuff. Maybe there isn't an answer as such for that question. I guess someone with an on-board low spec GPU may be more limited than a high end card with fast vram. Quality wise - I can't see any difference, the higher yuv= numbers give more features like gamma correction (not sure how to use it though). It would be nice if 600g could beat or equal 600c - it does for 3D, but for some reason not this. I said classic was twice as fast - it's actually more than that if I discount time taken by the codec.
[Bug 32982] Kernel locks up a few minutes after boot
https://bugzilla.kernel.org/show_bug.cgi?id=32982 --- Comment #6 from Bart Van Assche 2011-04-13 18:49:13 --- Although I'm still busy bisecting, I'd like to report that I got the following hung task report with head b73a21fc66fee35b41db755abebfacba48b2fc76 (had already seen something similar before with 2.6.39-rc2): INFO: task kjournald:918 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. kjournald D 880131b9ddb8 0 918 2 0x 880131b9dd20 0046 880131b9dca0 8108cd6d 0282 880131b9dfd8 880137729f40 880131b9dfd8 880131b9c000 880131b9c000 880131b9c000 880131b9dfd8 Call Trace: [] ? trace_hardirqs_on_caller+0x14d/0x190 [] ? sub_preempt_count+0xa9/0xe0 [] journal_commit_transaction+0x13e/0x1590 [jbd] [] ? _raw_spin_unlock_irqrestore+0x65/0x80 [] ? sub_preempt_count+0xa9/0xe0 [] ? wake_up_bit+0x40/0x40 [] ? del_timer_sync+0x8a/0xc0 [] ? try_to_del_timer_sync+0x110/0x110 [] kjournald+0xf1/0x250 [jbd] [] ? wake_up_bit+0x40/0x40 [] ? commit_timeout+0x10/0x10 [jbd] [] kthread+0x96/0xa0 [] kernel_thread_helper+0x4/0x10 [] ? finish_task_switch+0x7b/0xe0 [] ? _raw_spin_unlock_irq+0x3b/0x60 [] ? retint_restore_args+0xe/0xe [] ? __init_kthread_worker+0x70/0x70 [] ? gs_change+0xb/0xb no locks held by kjournald/918. INFO: task klauncher:5744 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. klauncher D 0001000297b4 0 5744 5743 0x 88011dd73938 0046 8801 8108cbef 813e2535 88011dd73fd8 8801382e1f40 88011dd73fd8 88011dd72000 88011dd72000 88011dd72000 88011dd73fd8 Call Trace: [] ? mark_held_locks+0x6f/0xa0 [] ? _raw_spin_unlock_irqrestore+0x65/0x80 [] ? __wait_on_buffer+0x30/0x30 [] io_schedule+0x59/0x80 [] sleep_on_buffer+0xe/0x20 [] __wait_on_bit_lock+0x5a/0xc0 [] ? __wait_on_buffer+0x30/0x30 [] out_of_line_wait_on_bit_lock+0x78/0x90 [] ? autoremove_wake_function+0x50/0x50 [] __lock_buffer+0x36/0x40 [] do_get_write_access+0x64d/0x660 [jbd] [] ? 
sub_preempt_count+0xa9/0xe0 [] ? start_this_handle+0x370/0x470 [jbd] [] ? journal_add_journal_head+0xf4/0x220 [jbd] [] journal_get_write_access+0x31/0x50 [jbd] [] __ext3_journal_get_write_access+0x2d/0x60 [ext3] [] ext3_reserve_inode_write+0x83/0xb0 [ext3] [] ext3_mark_inode_dirty+0x44/0x70 [ext3] [] ext3_dirty_inode+0x5e/0xa0 [ext3] [] __mark_inode_dirty+0x3f/0x250 [] file_update_time+0xec/0x170 [] ? mutex_lock_nested+0x27d/0x3a0 [] __generic_file_aio_write+0x1f8/0x440 [] generic_file_aio_write+0x75/0xf0 [] do_sync_write+0xda/0x120 [] ? remove_vma+0x77/0x90 [] ? trace_hardirqs_on+0xd/0x10 [] ? remove_vma+0x77/0x90 [] vfs_write+0xc6/0x170 [] sys_write+0x51/0x90 [] system_call_fastpath+0x16/0x1b 2 locks held by klauncher/5744: #0: (&sb->s_type->i_mutex_key#11){+.+.+.}, at: [] generic_file_aio_write+0x59/0xf0 #1: (jbd_handle){+.+...}, at: [] start_this_handle+0x370/0x470 [jbd] INFO: task okular:4180 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. okular D 00010002a251 0 4180 5743 0x 880041d13aa8 0046 8800 8108cd6d 0282 880041d13fd8 880037a59f40 880041d13fd8 880041d12000 880041d12000 880041d12000 880041d13fd8 Call Trace: [] ? trace_hardirqs_on_caller+0x14d/0x190 [] start_this_handle+0x244/0x470 [jbd] [] ? is_module_address+0x33/0x60 [] ? wake_up_bit+0x40/0x40 [] journal_start+0xdb/0x120 [jbd] [] ext3_journal_start_sb+0x36/0x70 [ext3] [] ext3_setattr+0x1a3/0x210 [ext3] [] notify_change+0x116/0x360 [] do_truncate+0x63/0x90 [] ? sub_preempt_count+0xa9/0xe0 [] do_last+0x42c/0x820 [] path_openat+0xd0/0x410 [] ? might_fault+0x53/0xb0 [] do_filp_open+0x7f/0xa0 [] ? sub_preempt_count+0xa9/0xe0 [] ? _raw_spin_unlock+0x35/0x60 [] ? 
alloc_fd+0xf4/0x150 [] do_sys_open+0x101/0x1e0 [] sys_open+0x20/0x30 [] system_call_fastpath+0x16/0x1b 2 locks held by okular/4180: #0: (&sb->s_type->i_mutex_key#11){+.+.+.}, at: [] do_truncate+0x57/0x90 #1: (&sb->s_type->i_alloc_sem_key#4){+.+...}, at: [] notify_change+0x2a0/0x360
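When a report like the one above contains several hung-task entries, it can help to pull out just the task names, PIDs, and timeouts before reading the call traces. The following is a minimal sketch using only Python's standard `re` module; the helper name `find_hung_tasks` and the `HungTask` type are hypothetical, and the pattern only matches the exact "INFO: task ... blocked for more than ... seconds." wording seen in this report.

```python
import re
from typing import List, NamedTuple


class HungTask(NamedTuple):
    name: str
    pid: int
    seconds: int


# Matches the header line the kernel prints for each hung task in this report.
_HUNG_RE = re.compile(r"INFO: task (\S+):(\d+) blocked for more than (\d+) seconds\.")


def find_hung_tasks(log: str) -> List[HungTask]:
    """Return one HungTask per 'INFO: task ...' header found in the log text."""
    return [
        HungTask(m.group(1), int(m.group(2)), int(m.group(3)))
        for m in _HUNG_RE.finditer(log)
    ]


# Header lines copied from the report above, joined as they appear in the mail.
SAMPLE = (
    "INFO: task kjournald:918 blocked for more than 120 seconds. "
    "INFO: task klauncher:5744 blocked for more than 120 seconds. "
    "INFO: task okular:4180 blocked for more than 120 seconds."
)

for task in find_hung_tasks(SAMPLE):
    print(task.name, task.pid, task.seconds)
```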
Linux 2.6.39-rc3
On 04/13/2011 10:21 AM, Joerg Roedel wrote: > On Wed, Apr 13, 2011 at 08:46:09AM +0200, Ingo Molnar wrote: > First of all, I bisected between v2.6.37-rc2..f005fe12b90c which where > only a couple of patches and merged v2.6.38-rc4 in at every step. There > was no failure found. > Then I tried this again, but this time I merged v2.6.38-rc5 at every > step and was successful. The bad commit in this branch turned out to be > > 1a4a678b12c84db9ae5dce424e0e97f0559bb57c > > which is related to memblock. > > Then I tried to find out which change between 2.6.38-rc4 and 2.6.38-rc5 > is needed to trigger the failure, so I used f005fe12b90c as a base, > bisected between v2.6.38-rc4..v2.6.38-rc5 and merged every bisect step > into the base and tested. Here the bad commit turned out to be > > e6d2e2b2b1e1455df16d68a78f4a3874c7b3ad20 > > which is related to gart. It turned out that the gart aperture on that > box is on another position with these patches. Before it was as > 0xa400 and now it is at 0xa000. It seems like this has something > to do with the root-cause. > > Reverting commit 1a4a678b12c84db9ae5dce424e0e97f0559bb57c fixes the > problem btw. and booting with iommu=soft also works, but I have no idea > yet why the aperture at that address is a problem (with the patch > reverted the aperture lands at 0x8000). > > I have put some debug-data online. There is my .config and two > dmesg-files for good (==2.6.39-rc3 + revert) and bad (==2.6.39-rc3) > I also created these dmesg-files again with memblock=debug, maybe that > helps to find the problem. The files are at > > http://www.8bytes.org/~joro/debug/ thanks for the bisecting... so those two patches uncover some problems. [0.00] Checking aperture... [0.00] No AGP bridge found [0.00] Node 0: aperture @ a000 size 32 MB [0.00] Aperture pointing to e820 RAM. Ignoring. 
[0.00] Your BIOS doesn't leave a aperture memory hole [0.00] Please enable the IOMMU option in the BIOS setup [0.00] This costs you 64 MB of RAM [0.00] memblock_x86_reserve_range: [0xa000-0xa3ff] aperture64 [0.00] Mapping aperture over 65536 KB of RAM @ a000 so kernel try to reallocate apperture. because BIOS allocated is pointed to RAM or size is too small. but your radeon does use [0xa000, 0xbfff) [4.281993] radeon :01:05.0: VRAM: 320M 0xC000 - 0xD3FF (320M used) [4.290672] radeon :01:05.0: GTT: 512M 0xA000 - 0xBFFF [4.298550] [drm] Detected VRAM RAM=320M, BAR=256M [4.309857] [drm] RAM width 32bits DDR [4.313748] [TTM] Zone kernel: Available graphics memory: 1896524 kiB. [4.320379] [TTM] Initializing pool allocator. [4.324948] [drm] radeon: 320M of VRAM memory ready [4.329832] [drm] radeon: 512M of GTT memory ready. and the one seems working: [0.00] Checking aperture... [0.00] No AGP bridge found [0.00] Node 0: aperture @ a000 size 32 MB [0.00] Aperture pointing to e820 RAM. Ignoring. [0.00] Your BIOS doesn't leave a aperture memory hole [0.00] Please enable the IOMMU option in the BIOS setup [0.00] This costs you 64 MB of RAM [0.00] memblock_x86_reserve_range: [0x8000-0x83ff] aperture64 [0.00] Mapping aperture over 65536 KB of RAM @ 8000 [0.00] memblock_x86_reserve_range: [0xacb6bdc0-0xacb6bddf] BOOTMEM will use different position... [4.250159] radeon :01:05.0: VRAM: 320M 0xC000 - 0xD3FF (320M used) [4.258830] radeon :01:05.0: GTT: 512M 0xA000 - 0xBFFF [4.266742] [drm] Detected VRAM RAM=320M, BAR=256M [4.271549] [drm] RAM width 32bits DDR [4.275435] [TTM] Zone kernel: Available graphics memory: 1896526 kiB. [4.282066] [TTM] Initializing pool allocator. [4.282085] usb 7-2: new full speed USB device number 2 using ohci_hcd [4.293076] [drm] radeon: 320M of VRAM memory ready [4.298277] [drm] radeon: 512M of GTT memory ready. [4.303218] [drm] Supports vblank timestamp caching Rev 1 (10.10.2010). [4.309854] [drm] Driver supports precise vblank timestamp query. 
[4.315970] [drm] radeon: irq initialized. [4.320094] [drm] GART: num cpu pages 131072, num gpu pages 131072 So question is why radeon is using the address [0xa000 - 0xc00], and in E820 it is RAM [0.00] BIOS-e820: 0010 - acb8d000 (usable) [0.00] BIOS-e820: acb8d000 - acb8f000 (reserved) [0.00] BIOS-e820: acb8f000 - afce9000 (usable) [0.00] BIOS-e820: afce9000 - afd21000 (reserved) [0.00] BIOS-e820: afd21000 - afd4f000 (usable) [0.00] BIOS-e820: afd4f000 - afdcf000 (reserved) [0.00] BIOS-e820: afdcf000
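The "Aperture pointing to e820 RAM" message quoted above comes down to a range-overlap test between the aperture and the usable regions of the firmware memory map. The sketch below shows that test in Python. The function name `overlapping_usable` is hypothetical, and the memory map and aperture addresses are made-up illustrative values, not the ones from this machine (the addresses in the quoted logs are truncated, so they are deliberately not reconstructed here).

```python
from typing import List, Tuple

# (start, end_exclusive, kind); a simplified stand-in for an e820 entry.
Region = Tuple[int, int, str]


def overlapping_usable(e820: List[Region], start: int, size: int) -> List[Tuple[int, int]]:
    """Return the usable regions that intersect [start, start + size)."""
    end = start + size
    return [
        (r_start, r_end)
        for (r_start, r_end, kind) in e820
        if kind == "usable" and r_start < end and start < r_end
    ]


# Illustrative map only: 2 GiB of usable RAM followed by a reserved hole.
E820 = [
    (0x0010_0000, 0x8000_0000, "usable"),
    (0x8000_0000, 0x9000_0000, "reserved"),
]
APERTURE_SIZE = 64 << 20  # 64 MiB, the size mentioned in the quoted log

# An aperture placed inside usable RAM overlaps; one in the hole does not.
print(overlapping_usable(E820, 0x7C00_0000, APERTURE_SIZE))
print(overlapping_usable(E820, 0x8400_0000, APERTURE_SIZE))
```

An aperture that overlaps usable RAM has to be backed by memory taken away from the system, which is what the "This costs you 64 MB of RAM" line in the log refers to.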
Linux 2.6.39-rc3
On 04/13/2011 10:21 AM, Joerg Roedel wrote: > > First of all, I bisected between v2.6.37-rc2..f005fe12b90c which where > only a couple of patches and merged v2.6.38-rc4 in at every step. There > was no failure found. > Then I tried this again, but this time I merged v2.6.38-rc5 at every > step and was successful. The bad commit in this branch turned out to be > > 1a4a678b12c84db9ae5dce424e0e97f0559bb57c > > which is related to memblock. > > Then I tried to find out which change between 2.6.38-rc4 and 2.6.38-rc5 > is needed to trigger the failure, so I used f005fe12b90c as a base, > bisected between v2.6.38-rc4..v2.6.38-rc5 and merged every bisect step > into the base and tested. Here the bad commit turned out to be > > e6d2e2b2b1e1455df16d68a78f4a3874c7b3ad20 > > which is related to gart. It turned out that the gart aperture on that > box is on another position with these patches. Before it was as > 0xa400 and now it is at 0xa000. It seems like this has something > to do with the root-cause. > > Reverting commit 1a4a678b12c84db9ae5dce424e0e97f0559bb57c fixes the > problem btw. and booting with iommu=soft also works, but I have no idea > yet why the aperture at that address is a problem (with the patch > reverted the aperture lands at 0x8000). > Does reverting e6d2e2b2b1e1455df16d68a78f4a3874c7b3ad20 solve the problem for you? 1a4a678b12c84db9ae5dce424e0e97f0559bb57c is a memory-allocation-order patch, which have a nasty tendency to unmask bugs elsewhere in the kernel. However, e6d2e2b2b1e1455df16d68a78f4a3874c7b3ad20 looks positively strange (and it doesn't exactly help that the description is written in Yinghai-ese and is therefore nearly impossible to decode, never mind tell if it is remotely correct.) -hpa
Linux 2.6.39-rc3
On 04/13/2011 10:21 AM, Joerg Roedel wrote: > On Wed, Apr 13, 2011 at 08:46:09AM +0200, Ingo Molnar wrote: >> Could you please send the before/after bootlog (in particular all memory >> init >> messages included) and your .config? >> >> before: f005fe12b90c: x86-64: Move out cleanup higmap [_brk_end, _end) out >> of init_memory_mapping() >> after: d2137d5af425: Merge branch 'linus' into x86/bootmem >> >> I've Cc:-ed more people who might have an idea about it. > > Okay, I have done some more bisecting and debugging today. > First of all, *huge* thanks for this effort. At least we need to track down the bits that need to be reverted -- it is past rc3, and it's time to see what we should revert and tell the submitter to try again next cycle. This looks to be the same issue as in bugzilla 33012: https://bugzilla.kernel.org/show_bug.cgi?id=33012 ... so it would be good if we could keep the information in there. -hpa
Linux 2.6.39-rc3
On Wed, Apr 13, 2011 at 11:51:39AM -0700, H. Peter Anvin wrote: > On 04/13/2011 10:21 AM, Joerg Roedel wrote: > > > > First of all, I bisected between v2.6.37-rc2..f005fe12b90c which where > > only a couple of patches and merged v2.6.38-rc4 in at every step. There > > was no failure found. > > Then I tried this again, but this time I merged v2.6.38-rc5 at every > > step and was successful. The bad commit in this branch turned out to be > > > > 1a4a678b12c84db9ae5dce424e0e97f0559bb57c > > > > which is related to memblock. > > > > Then I tried to find out which change between 2.6.38-rc4 and 2.6.38-rc5 > > is needed to trigger the failure, so I used f005fe12b90c as a base, > > bisected between v2.6.38-rc4..v2.6.38-rc5 and merged every bisect step > > into the base and tested. Here the bad commit turned out to be > > > > e6d2e2b2b1e1455df16d68a78f4a3874c7b3ad20 > > > > which is related to gart. It turned out that the gart aperture on that > > box is on another position with these patches. Before it was as > > 0xa400 and now it is at 0xa000. It seems like this has something > > to do with the root-cause. > > > > Reverting commit 1a4a678b12c84db9ae5dce424e0e97f0559bb57c fixes the > > problem btw. and booting with iommu=soft also works, but I have no idea > > yet why the aperture at that address is a problem (with the patch > > reverted the aperture lands at 0x8000). > > > > Does reverting e6d2e2b2b1e1455df16d68a78f4a3874c7b3ad20 solve the > problem for you? No, reverting that patch doesn't make the problem go away (and the gart aperture is still at 0xa000). I tested this in 39-rc3; I haven't tested whether it makes a difference on the original bisect commit from Ingo, but it probably does (I don't know if that matters). What is strange about this commit is that it fixes an x86 gart aperture allocation bug in generic memblock code. > 1a4a678b12c84db9ae5dce424e0e97f0559bb57c is a memory-allocation-order > patch, which have a nasty tendency to unmask bugs elsewhere in the > kernel. 
However, e6d2e2b2b1e1455df16d68a78f4a3874c7b3ad20 looks > positively strange (and it doesn't exactly help that the description is > written in Yinghai-ese and is therefore nearly impossible to decode, > never mind tell if it is remotely correct.) I think that the two commits are okay and the bug is somewhere else, but I have no idea yet where to look next. I spent some time looking at radeon code and talking to Alex about it (because it seemed suspicious that the GTT is at 0xa000 too, but as Alex explained to me this is an address in the GPU address space and shouldn't matter). Regards, Joerg
Linux 2.6.39-rc3
On Wed, Apr 13, 2011 at 11:39:29AM -0700, H. Peter Anvin wrote: > On 04/13/2011 10:21 AM, Joerg Roedel wrote: > > On Wed, Apr 13, 2011 at 08:46:09AM +0200, Ingo Molnar wrote: > >> Could you please send the before/after bootlog (in particular all memory > >> init > >> messages included) and your .config? > >> > >> before: f005fe12b90c: x86-64: Move out cleanup higmap [_brk_end, _end) > >> out of init_memory_mapping() > >> after: d2137d5af425: Merge branch 'linus' into x86/bootmem > >> > >> I've Cc:-ed more people who might have an idea about it. > > > > Okay, I have done some more bisecting and debugging today. > > > > First of all, *huge* thanks for this effort. At least we need to track > down the bits that need to be reverted -- it is past rc3, and it's time > to see what we should revert and tell the submitter to try again next cycle. > > This looks to be the same issue as in bugzilla 33012: > > https://bugzilla.kernel.org/show_bug.cgi?id=33012 > > ... so it would be good if we could keep the information in there. Yes, I will try to find my korg bugzilla account again and put the information from this email there. Joerg
Linux 2.6.39-rc3
On Wed, Apr 13, 2011 at 12:14:55PM -0700, Yinghai Lu wrote: > thanks for the bisecting... > > so those two patches uncover some problems. > > [0.00] Checking aperture... > [0.00] No AGP bridge found > [0.00] Node 0: aperture @ a000 size 32 MB > [0.00] Aperture pointing to e820 RAM. Ignoring. > [0.00] Your BIOS doesn't leave a aperture memory hole > [0.00] Please enable the IOMMU option in the BIOS setup > [0.00] This costs you 64 MB of RAM > [0.00] memblock_x86_reserve_range: [0xa000-0xa3ff] > aperture64 > [0.00] Mapping aperture over 65536 KB of RAM @ a000 > > so kernel try to reallocate apperture. because BIOS allocated is pointed to > RAM or size is too small. It is actually beyond 4GB on that machine; the value read here is from the previous kernel boot. The BIOS does not reset these values on a reboot. > but your radeon does use [0xa000, 0xbfff) Yes, I suspected that too (and spent a few hours reading radeon code), but then I talked to Alex Deucher and he explained that the addresses which the driver prints for GTT and VRAM are in the GPU address space and do not refer to system RAM. So this shouldn't be the problem. Joerg
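The point that GTT addresses live in the GPU's own address space, and only reach system memory through a translation table, can be illustrated with a toy model. This is a sketch, not radeon driver code: `ToyGart` and `gtt_size_mib` are hypothetical names, the addresses are made up, and the 4 KiB GPU page size is an assumption (it is consistent with the "num gpu pages 131072" and "512M of GTT memory" lines in the dmesg output quoted in this thread, but the thread does not state it).

```python
PAGE_SIZE = 4096  # assumed GPU page size in bytes (not stated in the thread)


def gtt_size_mib(num_gpu_pages: int, page_size: int = PAGE_SIZE) -> int:
    """Size of a GTT covering num_gpu_pages pages, in MiB."""
    return num_gpu_pages * page_size // (1 << 20)


class ToyGart:
    """Toy page table mapping GPU-internal GTT addresses to system pages."""

    def __init__(self, gtt_base: int, num_pages: int) -> None:
        self.gtt_base = gtt_base
        self.table = [None] * num_pages  # one system page address per GPU page

    def bind(self, index: int, system_page_addr: int) -> None:
        """Point GPU page `index` at a page-aligned system address."""
        if system_page_addr % PAGE_SIZE:
            raise ValueError("system page address must be page aligned")
        self.table[index] = system_page_addr

    def translate(self, gpu_addr: int) -> int:
        """Translate a GPU-internal address to the system address behind it."""
        offset = gpu_addr - self.gtt_base
        if not 0 <= offset < len(self.table) * PAGE_SIZE:
            raise ValueError("address outside the GTT range")
        entry = self.table[offset // PAGE_SIZE]
        if entry is None:
            raise LookupError("GPU page is not bound to system memory")
        return entry + offset % PAGE_SIZE


# 131072 pages of 4 KiB each cover 512 MiB.
print(gtt_size_mib(131072))

# Made-up addresses: the GPU-side base says nothing about where the
# backing system pages actually are.
gart = ToyGart(gtt_base=0x1000_0000, num_pages=4)
gart.bind(1, 0x7654_3000)
print(hex(gart.translate(0x1000_1010)))
```

In this model the numeric value of the GTT base is unrelated to the system addresses behind it, which is why a GTT range that happens to match a system RAM range is not by itself a conflict.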
Linux 2.6.39-rc3
On Wed, Apr 13, 2011 at 3:14 PM, Yinghai Lu wrote: > On 04/13/2011 10:21 AM, Joerg Roedel wrote: >> On Wed, Apr 13, 2011 at 08:46:09AM +0200, Ingo Molnar wrote: >> First of all, I bisected between v2.6.37-rc2..f005fe12b90c which where >> only a couple of patches and merged v2.6.38-rc4 in at every step. There >> was no failure found. >> Then I tried this again, but this time I merged v2.6.38-rc5 at every >> step and was successful. The bad commit in this branch turned out to be >> >> 1a4a678b12c84db9ae5dce424e0e97f0559bb57c >> >> which is related to memblock. >> >> Then I tried to find out which change between 2.6.38-rc4 and 2.6.38-rc5 >> is needed to trigger the failure, so I used f005fe12b90c as a base, >> bisected between v2.6.38-rc4..v2.6.38-rc5 and merged every bisect step >> into the base and tested. Here the bad commit turned out to be >> >> e6d2e2b2b1e1455df16d68a78f4a3874c7b3ad20 >> >> which is related to gart. It turned out that the gart aperture on that >> box is on another position with these patches. Before it was as >> 0xa400 and now it is at 0xa000. It seems like this has something >> to do with the root-cause. >> >> Reverting commit 1a4a678b12c84db9ae5dce424e0e97f0559bb57c fixes the >> problem btw. and booting with iommu=soft also works, but I have no idea >> yet why the aperture at that address is a problem (with the patch >> reverted the aperture lands at 0x8000). >> >> I have put some debug-data online. There is my .config and two >> dmesg-files for good (==2.6.39-rc3 + revert) and bad (==2.6.39-rc3) >> I also created these dmesg-files again with memblock=debug, maybe that >> helps to find the problem. The files are at >> >> http://www.8bytes.org/~joro/debug/ > > thanks for the bisecting... > > so those two patches uncover some problems. > > [0.00] Checking aperture... > [0.00] No AGP bridge found > [0.00] Node 0: aperture @ a000 size 32 MB > [0.00] Aperture pointing to e820 RAM. Ignoring. > [0.00] Your BIOS doesn't leave a aperture memory hole > [0.00] Please enable the IOMMU option in the BIOS setup > [0.00] This costs you 64 MB of RAM > [0.00] memblock_x86_reserve_range: [0xa000-0xa3ff] > aperture64 > [0.00] Mapping aperture over 65536 KB of RAM @ a000 > > so kernel try to reallocate apperture. because BIOS allocated is pointed to > RAM or size is too small. > > but your radeon does use [0xa000, 0xbfff) > > [4.281993] radeon :01:05.0: VRAM: 320M 0xC000 - > 0xD3FF (320M used) > [4.290672] radeon :01:05.0: GTT: 512M 0xA000 - > 0xBFFF > [4.298550] [drm] Detected VRAM RAM=320M, BAR=256M > [4.309857] [drm] RAM width 32bits DDR > [4.313748] [TTM] Zone kernel: Available graphics memory: 1896524 kiB. > [4.320379] [TTM] Initializing pool allocator. > [4.324948] [drm] radeon: 320M of VRAM memory ready > [4.329832] [drm] radeon: 512M of GTT memory ready. > > and the one seems working: > > [0.00] Checking aperture... > [0.00] No AGP bridge found > [0.00] Node 0: aperture @ a000 size 32 MB > [0.00] Aperture pointing to e820 RAM. Ignoring. > [0.00] Your BIOS doesn't leave a aperture memory hole > [0.00] Please enable the IOMMU option in the BIOS setup > [0.00] This costs you 64 MB of RAM > [0.00] memblock_x86_reserve_range: [0x8000-0x83ff] > aperture64 > [0.00] Mapping aperture over 65536 KB of RAM @ 8000 > [0.00] memblock_x86_reserve_range: [0xacb6bdc0-0xacb6bddf] > BOOTMEM > > will use different position... > > [4.250159] radeon :01:05.0: VRAM: 320M 0xC000 - > 0xD3FF (320M used) > [4.258830] radeon :01:05.0: GTT: 512M 0xA000 - > 0xBFFF > [4.266742] [drm] Detected VRAM RAM=320M, BAR=256M > [4.271549] [drm] RAM width 32bits DDR > [4.275435] [TTM] Zone kernel: Available graphics memory: 1896526 kiB. > [4.282066] [TTM] Initializing pool allocator. > [4.282085] usb 7-2: new full speed USB device number 2 using ohci_hcd > [4.293076] [drm] radeon: 320M of VRAM memory ready > [4.298277] [drm] radeon: 512M of GTT memory ready. > [4.303218] [drm] Supports vblank timestamp caching Rev 1 (10.10.2010). > [4.309854] [drm] Driver supports precise vblank timestamp query. > [4.315970] [drm] radeon: irq initialized. > [4.320094] [drm] GART: num cpu pages 131072, num gpu pages 131072 > > So question is why radeon is using the address [0xa000 - 0xc00], and > in E820 it is RAM The VRAM and GTT addresses in the dmesg are internal GPU addresses, not system addresses. The GPU has its own internal address space for on-chip memory clients (texture samplers, render buffers, display controllers, etc.). The GPU sets up two apertures in its internal addres
Revert 737a3bb9416ce2a7c7a4170852473a4fcc9c67e8 ?
Michel Dänzer wrote: >>> That does sound like the GPU locks up. Do you get any messages in dmesg >>> about lockups and attempts to reset the GPU at any time? >> >> No. > > Hmm, I guess the constant SIGALRMs might prevent the lockup detection > from kicking in... Maybe you can try starting the X server with > -dumbSched to see if that gets things along any further, but in the end > there's probably no way around figuring out what causes the lockup and > fixing that anyway. I have an old AGP box that locks with 600g + agpgart - It used to give GPU lockup to dmesg/log, but (I only test it occasionally) it doesn't anymore. I can still sysrq OK. I wonder if something changed in recent months in the drm/whatever code that has changed/blocked the logging.