On Sun, Apr 13, 2008 at 05:55:36AM +0200, NN_il_Confusionario wrote:
> On Sat, Apr 12, 2008 at 11:23:48PM +0000, [EMAIL PROTECTED] wrote:
> > [EMAIL PROTECTED]:~$ free -b
> >              total       used       free     shared    buffers     cached
> > Mem:    1061478400  311463936  750014464          0  100552704  105132032
> > -/+ buffers/cache:  105779200  955699200
> > Swap:    699138048          0  699138048
> >  . .Detected 1495.263 MHz processor.
>
> By my standards this is a very modern and powerful box.
>
> If memtest86 (or memtest from the memtester package, if you cannot spare
> the box) and a check of the logs do not show anything, I would
> _temporarily_ try another kernel (a newer one from etch-and-a-half or
> backports, and/or an older one from sarge; or even the SUSE kernel that
> was running fine before) to understand whether a bug report against the
> current kernel in etch is needed.

Further to my previous message, I gave memtest a try with 300M:

[EMAIL PROTECTED]:~$ sudo memtest 300M -l
memtest v. 2.93.1
(C) 2000 Charles Cazabon <[EMAIL PROTECTED]>
Original v.1 (C) 1999 Simon Kirby <[EMAIL PROTECTED]> <[EMAIL PROTECTED]>

Current limits:
  RLIMIT_RSS  0xffffffff
  RLIMIT_VMEM 0xffffffff
Raising limits...
Allocated 314572800 bytes...trying mlock...success.  Starting tests...

Testing 314568704 bytes at 0xa51da000 (4088 bytes lost to page alignment).

Run    1:
  Test  1:         Stuck Address:  Testing...Passed.
  Test  2:          Random value:  Setting...Testing...Passed.
  Test  3:        XOR comparison:  Setting...Testing...Passed.
  Test  4:        SUB comparison:  Setting...Testing...Passed.
  Test  5:        MUL comparison:  Setting...Testing...Passed.
  Test  6:        DIV comparison:  Setting...Testing...Passed.
  Test  7:         OR comparison:  Setting...Testing...Passed.
  Test  8:        AND comparison:  Setting...Testing...Passed.
  Test  9:  Sequential Increment:  Setting...Testing...Passed.
  Test 10:            Solid Bits:  Testing...Passed.
  Test 11:      Block Sequential:  Testing...  15

free showed a reasonable amount of memory still in the buffer pool:
[EMAIL PROTECTED]:/var/log$ free -b
             total       used       free     shared    buffers     cached
Mem:    1061478400 1045397504   16080896          0  269635584  139096064
-/+ buffers/cache:  636665856  424812544
Swap:    699138048          0  699138048
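
As a cross-check of the figures above, here is a minimal Python sketch of how the "-/+ buffers/cache" line follows from the "Mem:" line. The function name without_cache is made up for illustration; the numbers are copied from the free -b output above.

```python
# Figures copied from the `free -b` output above (all in bytes).
used, free = 1045397504, 16080896
buffers, cached = 269635584, 139096064


def without_cache(used, free, buffers, cached):
    """Return (used, free) as shown on the '-/+ buffers/cache' line.

    Buffers and page cache are counted as reclaimable: they are
    subtracted from 'used' and added to 'free'.
    """
    reclaimable = buffers + cached
    return used - reclaimable, free + reclaimable


app_used, app_free = without_cache(used, free, buffers, cached)
print(app_used, app_free)  # 636665856 424812544
```

So about 405 MiB was still nominally available once buffers and cache are counted as reclaimable.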

So I interrupted that run and tried upping the memtest size to 500M:

Received signal 2 (Interrupt)
munlock'ed memory.
0 runs completed.  0 errors detected.  Total runtime:  130 seconds.

Exiting...
[EMAIL PROTECTED]:~$ sudo memtest 500M -l
memtest v. 2.93.1
(C) 2000 Charles Cazabon <[EMAIL PROTECTED]>
Original v.1 (C) 1999 Simon Kirby <[EMAIL PROTECTED]> <[EMAIL PROTECTED]>

Current limits:
  RLIMIT_RSS  0xffffffff
  RLIMIT_VMEM 0xffffffff
Raising limits...
Allocated 524288000 bytes...trying mlock...
Message from [EMAIL PROTECTED] at Sun Apr 13 15:38:38 2008 ...
tuko kernel: Oops: 0002 [#1]

Message from [EMAIL PROTECTED] at Sun Apr 13 15:38:38 2008 ...
tuko kernel: EIP is at _spin_lock+0x1/0xf

Message from [EMAIL PROTECTED] at Sun Apr 13 15:38:38 2008 ...
tuko kernel: eax: 00000044   ebx: 00000000   ecx: 00000001   edx: e7893d98

Message from [EMAIL PROTECTED] at Sun Apr 13 15:38:38 2008 ...
tuko kernel: esi: e7893d98   edi: 00000025   ebp: 00000025   esp: dfa67f00

Message from [EMAIL PROTECTED] at Sun Apr 13 15:38:38 2008 ...
tuko kernel: ds: 007b   es: 007b   ss: 0068

Message from [EMAIL PROTECTED] at Sun Apr 13 15:38:38 2008 ...
tuko kernel: Process kswapd0 (pid: 122, ti=dfa66000 task=dff98550 task.ti=dfa66000)

Message from [EMAIL PROTECTED] at Sun Apr 13 15:38:38 2008 ...
tuko kernel: Stack: c015e27d e7893ca4 00000000 c016f31e 00000080 e7893ea4 e7887ab4 0001a004

Message from [EMAIL PROTECTED] at Sun Apr 13 15:38:38 2008 ...
tuko kernel:        dfffeac0 00000088 000000d0 c0148ca8 00680100 00000000 00680100 00031357

Message from [EMAIL PROTECTED] at Sun Apr 13 15:38:38 2008 ...
tuko kernel:        00000080 00000000 00000000 c02ccec0 c02ccec0 00000003 c0149053 00000000

Message from [EMAIL PROTECTED] at Sun Apr 13 15:38:38 2008 ...
tuko kernel: Call Trace:

Message from [EMAIL PROTECTED] at Sun Apr 13 15:38:38 2008 ...
tuko kernel: Code: 05 90 ff 02 30 c9 89 c8 c3 89 c2 90 81 28 00 00 00 01 0f 94 c0 84 c0 b9 01 00 00 00 75 09 90 81 02 00 00 00 01 30 c9 89 c8 c3 90 <fe> 08 79 09 f3 90 80 38 00 7e f9 eb f2 c3 90 81 28 00 00 00 01

Message from [EMAIL PROTECTED] at Sun Apr 13 15:38:38 2008 ...
tuko kernel: EIP: [<c028091a>] _spin_lock+0x1/0xf SS:ESP 0068:dfa67f00


The same error appeared instantly, so I am guessing that a faulty memory
module would have been detected more gracefully, and that this more likely
indicates something going seriously wrong when kswapd becomes active.
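
For what it is worth, the "Oops: 0002" value can be read as an x86 page-fault error code. The sketch below assumes the usual 32-bit x86 convention for the three low bits (bit 0: page was present, bit 1: access was a write, bit 2: fault came from user mode); the function name decode_x86_fault_code is made up for illustration.

```python
def decode_x86_fault_code(code):
    """Decode the three low bits of an x86 page-fault error code.

    Assumed convention (32-bit x86):
      bit 0 set -> the page was present (protection fault)
      bit 1 set -> the access was a write
      bit 2 set -> the fault happened in user mode
    """
    return {
        "page_present": bool(code & 0x1),
        "write": bool(code & 0x2),
        "user_mode": bool(code & 0x4),
    }


# The value is printed in hex in the oops line: "Oops: 0002"
print(decode_x86_fault_code(0x0002))
# {'page_present': False, 'write': True, 'user_mode': False}
```

Under that reading, the oops is a kernel-mode write to an unmapped address. That would fit the register dump: the marked bytes <fe> 08 appear to be a byte decrement through eax, and eax holds 00000044, which suggests _spin_lock was handed a lock at a small offset from a NULL pointer. This is only my interpretation of the dump, not a confirmed diagnosis.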

It looks like the system is still running, but any attempt to access the
hard drive gets stuck in an uninterruptible sleep.

At least the problem seems to be easily reproduced...

Regards,
DigbyT

