On Fri, Aug 08, 2008 at 08:51:33PM +0200, Peter Palfrader wrote:
> Package: linux-image-2.6.26-1-amd64
> Version: 2.6.26-1
> Severity: important
> 
> Hi,
> 
> it seems that 2.6.26 (whether the debian package or the kernel.org
> kernel) locks up after a while on Debian's DL385G1 systems.
> 
> After a while, sooner with more disk IO/filesystem load, the system
> hangs: it continues to do stuff but everything involving disk hangs
> forever.
> 
> The systems work just fine on a 2.6.25.10 kernel.
> 
> The servers have Opterons like this:
> cpu family      : 15
> model           : 33
> 
> so http://www.uwsg.iu.edu/hypermail/linux/kernel/0808.0/0882.html might
> explain it.

hey Peter,
 This is readily reproducible - a simple kernel compile was all it
took. git bisecting suggests that this issue was introduced by [1]
and unmasked by [2] during 2.6.26 development. It was later fixed
during 2.6.27 development by [3].

Can you confirm that the attached backport of [3] fixes the problem
for you?

[1] 35605a1027ac630f85a1b95684f7e86b82498cd6
[2] 8d539108560ec121d59eee05160236488266221c
[3] 8004dd965b13b01a96def054d420f6df7ff22d53


-- 
dann frazier

commit 8004dd965b13b01a96def054d420f6df7ff22d53
Author: Yinghai Lu <[EMAIL PROTECTED]>
Date:   Mon May 12 17:40:39 2008 -0700

    x86: amd opteron TOM2 mask val fix
    
    there is a typo in the mask value, need to remove that extra 0,
    to avoid 4bit clearing.
    
    Signed-off-by: Yinghai Lu <[EMAIL PROTECTED]>
    Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]>

Backported to Debian's 2.6.26 by dann frazier <[EMAIL PROTECTED]>

diff -urpN linux-source-2.6.26.orig/arch/x86/kernel/cpu/mtrr/generic.c linux-source-2.6.26/arch/x86/kernel/cpu/mtrr/generic.c
--- linux-source-2.6.26.orig/arch/x86/kernel/cpu/mtrr/generic.c 2008-08-11 22:55:59.000000000 -0600
+++ linux-source-2.6.26/arch/x86/kernel/cpu/mtrr/generic.c      2008-08-11 22:57:13.000000000 -0600
@@ -219,7 +219,7 @@ void __init get_mtrr_state(void)
                tom2 = hi;
                tom2 <<= 32;
                tom2 |= lo;
-               tom2 &= 0xffffff8000000ULL;
+               tom2 &= 0xffffff800000ULL;
        }
        if (mtrr_show) {
                int high_width;
diff -urpN linux-source-2.6.26.orig/arch/x86/pci/k8-bus_64.c linux-source-2.6.26/arch/x86/pci/k8-bus_64.c
--- linux-source-2.6.26.orig/arch/x86/pci/k8-bus_64.c   2008-08-11 22:55:59.000000000 -0600
+++ linux-source-2.6.26/arch/x86/pci/k8-bus_64.c        2008-08-11 22:57:13.000000000 -0600
@@ -384,7 +384,7 @@ static int __init early_fill_mp_bus_info
        /* need to take out [0, TOM) for RAM*/
        address = MSR_K8_TOP_MEM1;
        rdmsrl(address, val);
-       end = (val & 0xffffff8000000ULL);
+       end = (val & 0xffffff800000ULL);
        printk(KERN_INFO "TOM: %016lx aka %ldM\n", end, end>>20);
        if (end < (1ULL<<32))
                update_range(range, 0, end - 1);
@@ -478,7 +478,7 @@ static int __init early_fill_mp_bus_info
                /* TOP_MEM2 */
                address = MSR_K8_TOP_MEM2;
                rdmsrl(address, val);
-               end = (val & 0xffffff8000000ULL);
+               end = (val & 0xffffff800000ULL);
                printk(KERN_INFO "TOM2: %016lx aka %ldM\n", end, end>>20);
                update_range(range, 1ULL<<32, end - 1);
        }
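
For anyone who wants to see what the extra 0 in the mask does, here is a
small standalone sketch of the arithmetic. It is not kernel code: the
helper names and the sample register value 0x128800000 are made up for
illustration and are not taken from the affected machines. The old mask
keeps bits 27 and up, so the value is rounded down to a 128 MiB boundary;
the corrected mask keeps bits 23 and up, so it is rounded down to an 8 MiB
boundary. Those extra four cleared bits (23-26) are the "4bit clearing"
mentioned in the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Mask values copied from the patch above. */
#define TOM_MASK_OLD   0xffffff8000000ULL  /* extra 0: keeps bits 27..51 */
#define TOM_MASK_FIXED 0xffffff800000ULL   /* keeps bits 23..47 */

/* Hypothetical helpers (not kernel functions): apply each mask. */
uint64_t tom_old(uint64_t val)   { return val & TOM_MASK_OLD; }
uint64_t tom_fixed(uint64_t val) { return val & TOM_MASK_FIXED; }

/* Self-check on one made-up register value: 4 GiB + 648 MiB. */
void tom_mask_example(void)
{
    uint64_t val = 0x128800000ULL;  /* assumed value, illustration only */

    /* Corrected mask: value is already 8 MiB aligned, so it is unchanged. */
    assert(tom_fixed(val) == 0x128800000ULL);

    /* Old mask: bit 23 is cleared as well, so 8 MiB goes missing. */
    assert(tom_old(val) == 0x128000000ULL);
    assert(tom_fixed(val) - tom_old(val) == (8ULL << 20));
}
```

Calling tom_mask_example() from a test program should pass silently. With
the old mask the reported top of memory can come out up to 120 MiB lower
than the value in the register, depending on bits 23-26.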
