On Fri, Aug 08, 2008 at 08:51:33PM +0200, Peter Palfrader wrote:
> Package: linux-image-2.6.26-1-amd64
> Version: 2.6.26-1
> Severity: important
>
> Hi,
>
> it seems that 2.6.26 (whether the debian package or the kernel.org
> kernel) locks up after a while on Debian's DL385G1 systems.
>
> After a while, sooner with more disk IO/filesystem load, the system
> hangs: it continues to do stuff but everything involving disk hangs
> forever.
>
> The systems work just fine on a 2.6.25.10 kernel.
>
> The servers have Opterons like this:
> cpu family : 15
> model : 33
>
> so http://www.uwsg.iu.edu/hypermail/linux/kernel/0808.0/0882.html might
> explain it.

hey Peter,
 This is readily reproducible - a simple kernel compile was all it
took. git bisecting suggests that this issue was introduced by [1] and
unmasked by [2] during 2.6.26 development. It was later fixed during
2.6.27 development by [3].

Can you confirm that the attached backport of [3] fixes the problem
for you?

[1] 35605a1027ac630f85a1b95684f7e86b82498cd6
[2] 8d539108560ec121d59eee05160236488266221c
[3] 8004dd965b13b01a96def054d420f6df7ff22d53

-- 
dann frazier

commit 8004dd965b13b01a96def054d420f6df7ff22d53
Author: Yinghai Lu <[EMAIL PROTECTED]>
Date:   Mon May 12 17:40:39 2008 -0700

    x86: amd opteron TOM2 mask val fix

    there is a typo in the mask value, need to remove that extra 0,
    to avoid 4bit clearing.

    Signed-off-by: Yinghal Lu <[EMAIL PROTECTED]>
    Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]>

Backported to Debian's 2.6.26 by dann frazier <[EMAIL PROTECTED]>

diff -urpN linux-source-2.6.26.orig/arch/x86/kernel/cpu/mtrr/generic.c linux-source-2.6.26/arch/x86/kernel/cpu/mtrr/generic.c
--- linux-source-2.6.26.orig/arch/x86/kernel/cpu/mtrr/generic.c	2008-08-11 22:55:59.000000000 -0600
+++ linux-source-2.6.26/arch/x86/kernel/cpu/mtrr/generic.c	2008-08-11 22:57:13.000000000 -0600
@@ -219,7 +219,7 @@ void __init get_mtrr_state(void)
 		tom2 = hi;
 		tom2 <<= 32;
 		tom2 |= lo;
-		tom2 &= 0xffffff8000000ULL;
+		tom2 &= 0xffffff800000ULL;
 	}
 	if (mtrr_show) {
 		int high_width;
diff -urpN linux-source-2.6.26.orig/arch/x86/pci/k8-bus_64.c linux-source-2.6.26/arch/x86/pci/k8-bus_64.c
--- linux-source-2.6.26.orig/arch/x86/pci/k8-bus_64.c	2008-08-11 22:55:59.000000000 -0600
+++ linux-source-2.6.26/arch/x86/pci/k8-bus_64.c	2008-08-11 22:57:13.000000000 -0600
@@ -384,7 +384,7 @@ static int __init early_fill_mp_bus_info
 	/* need to take out [0, TOM) for RAM*/
 	address = MSR_K8_TOP_MEM1;
 	rdmsrl(address, val);
-	end = (val & 0xffffff8000000ULL);
+	end = (val & 0xffffff800000ULL);
 	printk(KERN_INFO "TOM: %016lx aka %ldM\n", end, end>>20);
 	if (end < (1ULL<<32))
 		update_range(range, 0, end - 1);
@@ -478,7 +478,7 @@ static int __init early_fill_mp_bus_info
 		/* TOP_MEM2 */
 		address = MSR_K8_TOP_MEM2;
 		rdmsrl(address, val);
-		end = (val & 0xffffff8000000ULL);
+		end = (val & 0xffffff800000ULL);
 		printk(KERN_INFO "TOM2: %016lx aka %ldM\n", end, end>>20);
 		update_range(range, 1ULL<<32, end - 1);
 	}
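
For anyone wondering what the "4bit clearing" in the commit message amounts to, here is a rough userspace sketch of the arithmetic. Only the two mask constants come from the patch; the helper names and the sample MSR value 0xDF800000 are made up for illustration and are not read from any real machine. The extra 0 shifts the mask left by one hex digit, so it keeps bits 27..51 instead of bits 23..47, and the top-of-memory value gets rounded down to a 128 MiB boundary instead of an 8 MiB one.

```c
#include <assert.h>
#include <stdint.h>

/* Mask values copied from the patch. */
#define TOM_MASK_BROKEN 0xffffff8000000ULL /* extra 0: keeps bits 27..51 */
#define TOM_MASK_FIXED  0xffffff800000ULL  /* keeps bits 23..47 */

/* Hypothetical helper: apply a mask to a raw TOP_MEM/TOP_MEM2 value. */
uint64_t tom_apply_mask(uint64_t msr_val, uint64_t mask)
{
	return msr_val & mask;
}

/*
 * Hypothetical helper: how many bytes below the real top of memory the
 * broken mask drops. Assumes msr_val is below 2^48, so the two masks
 * differ only in bits 23..26.
 */
uint64_t tom_bytes_lost(uint64_t msr_val)
{
	return tom_apply_mask(msr_val, TOM_MASK_FIXED) -
	       tom_apply_mask(msr_val, TOM_MASK_BROKEN);
}

/* Self-check with an invented sample value (3576 MiB). */
void tom_mask_selfcheck(void)
{
	const uint64_t sample = 0xDF800000ULL; /* made-up TOP_MEM value */

	/* The fixed mask leaves an 8 MiB aligned value untouched. */
	assert(tom_apply_mask(sample, TOM_MASK_FIXED) == 0xDF800000ULL);
	/* The broken mask clears bits 23..26 and rounds down to 128 MiB. */
	assert(tom_apply_mask(sample, TOM_MASK_BROKEN) == 0xD8000000ULL);
	/* Difference: 120 MiB end up outside the computed range. */
	assert(tom_bytes_lost(sample) == (120ULL << 20));
}
```

Calling tom_mask_selfcheck() from a small main() is enough to try it. On a box whose top of memory already sits on a 128 MiB boundary the two masks give the same result, which may be part of why this does not show up everywhere, though that is a guess on my part.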