On 5/30/2012 8:55 AM, Seyyed Mohtadin Hashemi wrote:
> On Tue, 2012-05-29 at 15:48 -0500, Stan Hoeppner wrote:
>> On 5/29/2012 4:08 PM, Seyyed Mohtadin Hashemi wrote:
>>> On Tue, 2012-05-15 at 21:26 +0200, Seyyed Mohtadin Hashemi wrote:
>>>> On Tue, May 15, 2012 at 8:51 PM, Stan Hoeppner <s...@hardwarefreak.com> wrote:
>>>> On 5/15/2012 12:26 PM, Seyyed Mohtadin Hashemi wrote:
>>>> > On Tue, May 15, 2012 at 4:30 AM, Henrique de Moraes Holschuh <h...@debian.org> wrote:
>>>> >
>>>> >> On Mon, 14 May 2012, Stan Hoeppner wrote:
>>>> >>> On 5/13/2012 7:02 PM, Henrique de Moraes Holschuh wrote:
>>>> >>>> On Fri, 11 May 2012, Seyyed Mohtadin Hashemi wrote:
>>>> >>>>> On 5/10/2012 1:16 PM, Stan Hoeppner wrote:
>>>> >>>>>> If this doesn't fix the issue, and memtest and other utils can see all
>>>> >>>>>> 64GB just fine, then I'd say you're dealing with a BIOS bug.
>>>> >>>>>
>>>> >>>>> The very top of /var/log/dmesg has the kernel debug output about the memory
>>>> >>>>> map. It might well tell us very quickly who is the culprit, if the user
>>>> >>>>> with the problem can post it for the best working case and the non-working
>>>> >>>>> [ 0.000000] e820 update range: 00000000e0000000 - 000000101f000000 (usable) ==> (reserved)
>>>> >>>>> [ 0.000000] WARNING: BIOS bug: CPU MTRRs don't cover all of memory, losing 61936MB of RAM.
>>>> >>>>
>>>> >>>> There you have it.
>>>> >>>
>>>> >>> I'm not surprised I was correct WRT a BIOS bug, but I am a little
>>>> >>> embarrassed I didn't know and suggest this would be reported in dmesg.
>>>> >>> I admit I just don't see this very often--this being the 1st time
>>>> >>> actually seeing this WARNING.
>>>> >>
>>>> >> Well, it is the first time I've seen a BIOS screw it up so badly as to
>>>> >> have someone lose 61GiB of RAM over it.
>>>> >>
>>>> >>>> Any of the latest versions of the longterm kernels (2.6.32, 3.0), or
>>>> >>>> latest 3.2 should be able to repair MTRRs properly, but you have to
>>>> >>>> compile the kernel with that option enabled. It might be already
>>>> >>>> available, but not enabled by default. In that case, this might help
>>>> >>>> you:
>>>> >>>
>>>> >>> Yep. In vanilla 3.2.6 it's selected by default in menuconfig, and you
>>>> >>> can't un-select it.
>>>> >>
>>>> >> We _really_ need to have that enabled by default on the Debian kernels
>>>> >> IMO, if we don't enable it already.
>>>> >>
>>>> >> --
>>>> >> "One disk to rule them all, One disk to find them. One disk to bring
>>>> >> them all and in the darkness grind them. In the Land of Redmond
>>>> >> where the shadows lie." -- The Silicon Valley Tarot
>>>> >> Henrique Holschuh
>>>> >>
>>>> >
>>>> > Thank you for the tips Henrique and Stan, unfortunately i don't have time
>>>> > to build/test new kernels this week because i have to finish my thesis. I
>>>> > will have time next week to look at it and report back the results.
>>>>
>>>>
>>>> In that case you could simply install the latest backport kernel image
>>>> and see if that does the trick. Should be quick 'n painless.
>>>>
>>>> Add to /etc/apt/sources.list
>>>> deb http://backports.debian.org/debian-backports squeeze-backports \
>>>> main contrib non-free
>>>>
>>>> $ aptitude update
>>>> $ aptitude -t squeeze-backports install linux-image-3.2.0-0.bpo.1-amd64
>>>> $ shutdown -r now
>>>>
>>>> Should take less than 5 minutes.
>>>>
>>>> --
>>>> Stan
>>>>
>>>>
>>>> Funny you should mention that, I did actually try the exact kernel you
>>>> mentioned yesterday - it did not go well, i got kernel panic. I didn't
>>>> do many tests because i didn't have much time, i went back to the old
>>>> kernel, and though i'm not happy with the situation the computer at
>>>> least works and i can use the CPU to do calculations.
>>>
>>>
>>> Hi Stan,
>>>
>>> I RMA'd the MB and with the replacement I received I am able to run the
>>> 3.2 kernel and all installed RAM is usable. However, I have to use
>>> "noapic irqpoll acpi=force" boot flags.
>>
>> Needing some boot flags with some main boards isn't uncommon. And in
>> fact using various boot flags used to be (maybe still is) needed to get
>> Linux VMs running properly on VMWare ESX, specifically the system clock.
>> So the boot flags are just a bare metal hardware issue.
>>
>>> I did have a small problem, sometimes I would get "RAM R/W test fail" at
>>> BIOS POST. I had done extensive memtest on the DIMMs earlier so I only
>>> tested if the individual DIMMs could POST, only one gave the "RAM R/W
>>> test fail". After removing the faulty DIMM + a healthy DIMM the system
>>> works smoothly.
>>
>> What replacement board board did you get? Another ASUS or a SuperMicro?
>>
>
> I got another ASUS (same model), the only SuperMicro I could get at the
> vendor was Supermicro H8DGU-F or quad CPU MBs - non of which I wanted.
So, are you certain the original ASUS board was defective? You may want
to update the subject with SOLVED: and describe the fix.

Glad it's working now. :)

-- 
Stan