On Fri, Jul 07, 2000 at 10:17:37AM -0700, Eric Councell wrote:
>
> I have ordered replacement memory for the machine, but I
> am curious as to any other possible causes.

I know you were asking for other suggestions, but there's a convenient
way (except that it requires a reboot) to test your memory under Linux.
Take any large tarfile of text files (such as the Linux kernel source),
extract it into one directory, then extract it again, multiple times,
into a second directory, running a recursive diff after every extract.
If the two directories ever differ, you've probably got a memory
problem.

One annoyance is that to do the test right, you would want to disable
all memory caches from your BIOS.  In other words, you would need to
schedule downtime for the reboot(s).  (It's even better if you can run
the test with the different caches enabled one at a time.  If the
problem shows up with all caches disabled, you probably have a memory
problem, although it could still be other hardware.  If it shows up
only with one particular cache enabled, you'll know you have a bad
motherboard, cache, or CPU.)

I've attached part of a linux-kernel thread from a while back
that describes the test with sample code--Doug Ledford suggests
and discusses the following script in the second email in the
attached thread:

  #!/bin/sh
  cd /tmp
  tar xzf linux-2.1.123.tar.gz
  mv linux linux.save
  for i in 1 2 3 4 5 6 7 8 9 10
  do
    tar xzf linux-2.1.123.tar.gz
    diff -U 3 -rN linux.save linux
  done

(Note that with some kernels, you can get some ignorable errors
associated with tar extracts and permissions.)

Good luck.

 -Mark Shewmaker
  [EMAIL PROTECTED]
>From [EMAIL PROTECTED]  Tue Sep 29 06:02:54 1998
Subject: Re: utility for testing ram?
In-Reply-To: <[EMAIL PROTECTED]> from Henrik 
Olsen at "Sep 29, 98 01:43:34 am"
To: [EMAIL PROTECTED] (Henrik Olsen)
Date:   Tue, 29 Sep 1998 10:17:48 +0200 (MEST)
Cc: [EMAIL PROTECTED], [EMAIL PROTECTED]
From: [EMAIL PROTECTED] (Rogier Wolff)

Henrik Olsen wrote:
> On Mon, 28 Sep 1998, Ricardo Kleemann wrote:
> 
> > Hi,
> > 
> > Anyone know of a utility to extensively test a system's ram, in order to 
> > determine whether the ram has any faults?
> 
> The classic test is to compile the kernel repeatedly, as mentioned several
> times before, this will give the machine a thorough workout, including
> running at close to 100% cpu use for a long time, making for no cooldown
> in idling, which will make marginal components even more likely to fail.
> 
> A simple script to do continuous testing would be:
> 
> #!/bin/sh
> while true
> do
>   make clean
>   make
> done
> 
> Start it running overnight, if it's still running when you wake up, your
> memory's likely to be ok.

No.

Most likely gcc will crash and print an "aborted" message, and make will
abort the current build, but the following make clean and make will
clear all traces of this going wrong....

Try this:

#!/bin/sh
t=0 
while true
do
  make clean
  # redirect stdout to the log first, then send stderr to the same place
  make > log.$t 2>&1
  t=`expr $t + 1`
done

The logs should end up all being identical.....
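
A minimal sketch of how the logs could be compared once the loop has
run for a while.  The function name compare_logs is hypothetical, and
the sketch assumes the logs sit in one directory as log.0, log.1, ...
and that the build prints nothing that varies between runs (such as
timestamps).  The newest log may still be incomplete if the loop is
running.

```shell
#!/bin/sh
# compare_logs DIR (hypothetical helper, not part of the original thread)
# Compare every log.N in DIR byte-for-byte against log.0 and print the
# name of each one that differs.  Returns non-zero if any log differs.
compare_logs() {
    dir=$1
    rc=0
    for f in "$dir"/log.*; do
        # Skip the unexpanded pattern when no logs exist at all.
        [ -e "$f" ] || continue
        if ! cmp -s "$dir/log.0" "$f"; then
            echo "$f differs from $dir/log.0"
            rc=1
        fi
    done
    return $rc
}
```

Running "compare_logs ." in the build directory would list any log that
does not match the first one.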

                                Roger.


-- 
| Most people would die sooner than think....  |    [EMAIL PROTECTED] 
| in fact, most do.  -- Bertrand Russell       |     phone: +31-15-2137555 
We write Linux device drivers for any device you may have! fax: ..-2138217

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/

>From [EMAIL PROTECTED]  Tue Sep 29 02:59:09 1998
Date:   Mon, 28 Sep 1998 23:30:18 -0500
From: Doug Ledford <[EMAIL PROTECTED]>
To: Henrik Olsen <[EMAIL PROTECTED]>
CC: Ricardo Kleemann <[EMAIL PROTECTED]>, [EMAIL PROTECTED]
Subject: Re: utility for testing ram?

Henrik Olsen wrote:
> 
> On Mon, 28 Sep 1998, Ricardo Kleemann wrote:
> 
> > Hi,
> >
> > Anyone know of a utility to extensively test a system's ram, in order to
> > determine whether the ram has any faults?
> 
> The classic test is to compile the kernel repeatedly, as mentioned several
> times before, this will give the machine a thorough workout, including
> running at close to 100% cpu use for a long time, making for no cooldown
> in idling, which will make marginal components even more likely to fail.
> 
> A simple script to do continuous testing would be:
> 
> #!/bin/sh
> while true
> do
>   make clean
>   make
> done
> 
> Start it running overnight, if it's still running when you wake up, your
> memory's likely to be ok.

No, no, and NO!  If you want to test your RAM, you can't run some test that
is CPU power limited.  You'll never access your RAM here faster than the CPU
can compile the kernel, and I got news for people out there.  There ain't no
CPU yet that compiles a kernel faster than your RAM can read/write those
source code and object code pages.  This is a good CPU test, not a good RAM
test.  If your RAM fails during this test with Sig11's or whatever, then it
really wasn't marginal to begin with.  I know I've posted this test to the
list before, but without someone posting a better test, I still claim that
your best memory tester that exists is this script:

#!/bin/sh
cd /tmp
tar xzf linux-2.1.123.tar.gz
mv linux linux.save
for i in 1 2 3 4 5 6 7 8 9 10
do
  tar xzf linux-2.1.123.tar.gz
  diff -U 3 -rN linux.save linux
done

If that script spews anything to the screen, you've failed your memory
test.  The only exception to this is if your disk sub-system doesn't use
DMA, then this test is not as good as it could be, but if your system uses
DMA (such as a decent SCSI controller, or DMA IDE) then this test will show
bad RAM much faster and more reliably than compiling a kernel.
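
A sketch of the same idea wrapped in a function that counts failing
passes instead of printing the diffs.  The name tar_diff_test and its
two arguments are hypothetical, and the scratch directory comes from
mktemp -d, which is an assumption: if that location is a RAM-backed
filesystem, the disk and DMA path described above is not exercised, so
TMPDIR should point at a real disk.

```shell
#!/bin/sh
# tar_diff_test TARBALL PASSES (hypothetical helper)
# Extract TARBALL once as a reference copy, then re-extract it PASSES
# times, comparing each fresh extract against the reference with a
# recursive diff.  Prints a summary and returns non-zero if any pass
# showed differences.
tar_diff_test() {
    tarball=$1
    passes=$2
    work=$(mktemp -d) || return 2
    mkdir "$work/ref" "$work/new"
    # Reference copy: everything later is compared against this one.
    tar xzf "$tarball" -C "$work/ref" || { rm -rf "$work"; return 2; }
    failed=0
    i=1
    while [ "$i" -le "$passes" ]; do
        # Start each pass from an empty directory.
        rm -rf "$work/new"
        mkdir "$work/new"
        tar xzf "$tarball" -C "$work/new"
        if ! diff -rN "$work/ref" "$work/new" >/dev/null; then
            echo "pass $i: extracts differ"
            failed=$((failed + 1))
        fi
        i=$((i + 1))
    done
    rm -rf "$work"
    echo "$failed of $passes passes failed"
    [ "$failed" -eq 0 ]
}
```

For example, "tar_diff_test /tmp/linux-2.1.123.tar.gz 10" would mirror
the ten passes of the script above.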

-- 

 Doug Ledford  <[EMAIL PROTECTED]>
  Opinions expressed are my own, but
     they should be everybody's.


>From [EMAIL PROTECTED]  Tue Sep 29 06:06:19 1998
Date:   Tue, 29 Sep 1998 09:31:32 +0100 (GMT)
From: Matthew Kirkwood <[EMAIL PROTECTED]>
To: Ricardo Kleemann <[EMAIL PROTECTED]>
cc: [EMAIL PROTECTED]
Subject: Re: utility for testing ram?

On Mon, 28 Sep 1998, Ricardo Kleemann wrote:

> Anyone know of a utility to extensively test a system's ram, in order to
> determine whether the ram has any faults?

It's called gcc :)

Seriously, a 24-hour repeated kernel compile will stress your memory much
harder than things like memtest86 (sunsite:/pub/linux/handware/somewhere)
because the CPU, RAM, cache and various other peripherals are working
hard, and probably not following some easily identifiable pattern.

( while true; do
  make dep clean zImage modules
done ) >/dev/null 2>/tmp/errlog &

and wait.  After a day or so, grep /tmp/errlog for "signal" (usually 11 or
6).  Short of a gcc error (relatively unlikely :) each of those signals is
a bit-flip.  (Don't rely on all of them being caught, either.)

See http://bitwizard.nl/sig11/ for more information.
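
A small sketch of how the error log could be tallied by signal number
rather than read by eye.  The function name signal_summary is
hypothetical, and it assumes each relevant line contains the word
"signal" followed by a space and a number (for example "got fatal
signal 11").  Other message formats would need a different pattern.

```shell
#!/bin/sh
# signal_summary FILE (hypothetical helper)
# Pull the number that follows "signal " out of every matching line in
# FILE, then print one line per distinct signal with its count.
signal_summary() {
    sed -n 's/.*[Ss]ignal \([0-9][0-9]*\).*/\1/p' "$1" |
        sort -n | uniq -c |
        awk '{ print "signal " $2 ": " $1 " occurrence(s)" }'
}
```

Running "signal_summary /tmp/errlog" after a day or so would give one
line per signal number seen.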

Matthew.



>From [EMAIL PROTECTED]  Thu Oct  1 16:04:31 1998
Date:   Fri, 2 Oct 1998 01:45:15 +0800
To: [EMAIL PROTECTED]
CC: Doug Ledford <[EMAIL PROTECTED]>, [EMAIL PROTECTED]
From: [EMAIL PROTECTED]
Subject: Re: utility for testing ram?

Thus spake Doug Ledford:
* I'll stand by my claim that my test will trounce a gcc compile test any day
* of the week for finding bad RAM %^) ... experience.  Find a machine that
* fails my test on one out of every four passes, and I'll show you a machine
* that will compile kernels all day long without a hiccup (x86 arch anyway).

Doug:

        It is amazing!  You are right.  I ran your test on my four
dual PPro boxen; three of them did just fine while one had console
output.  No wonder I thought disk copies were corrupting data!  And the
machine in question (a SuperMicro P6DNE) compiled the kernel about
seventy times in a test with no problems whatsoever.  I am getting some
parity EDO SIMMs first out from Net Express-- the machine that had the
problem was the only one of the 4 that did not have ECC on (two were
SuperMicro P6DOF's, which are Orion and parity FPM only; one was an
Intel Providence with buffered ECC EDO DIMMs).  [The 4 machines total
1Gig of RAM ....]

                                                        Thanks, B.Y.

PS also testing your 5.1.0-p13~



