> > 
> > FYI, I see a similar problem under 2.4.5, also SMP, although only
> > intermittently.  Two oopses are below, from two different, although
> > similarly configured, machines.
> 
> [snip]
> 
> Sounds very similar.  Our servers are all identical (except for RAM).
> 
> What's unusual is that the machines we *expect* to fail sooner - don't
> (not even an oops).  Those are very busy cache servers (several of them
> in a sibling cluster) which do a lot of swapping.  The machines which
> *do* fail (or oops without any further catastrophe) are typically
> web/mail hosting servers (reasonably busy with about 25% swap being
> used).  Increasing swap did not help on 2.4.5.  We're still waiting for
> something to happen on 2.4.6 (ie, oops already appeared - waiting for
> meltdown, which, hopefully, will not occur).  We used to auto-reboot
> every morning at 2am or something to keep things stable - which I
> *hate* because I remember having a 2.0.35/6 workstation that had an
> uptime of 6 months a couple of years ago.  God, I loved that box.
> 

It's happened again.  The server which previously failed with memory
errors, has failed again and required a reboot.  It was using 26% swap,
and apache would fail to start with 'semget: No space left on device'. 
What we also noticed was that the kswapd process showed 'defunct' on
ps...  mean anything to anyone?

Regards
Henry

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Reply via email to