On Sat, Dec 29, 2012 at 1:32 AM, epsilon <epsilo...@t-online.de> wrote:
> recently we read a lot of total system freezes. Let me try to
> summarize:
>
> Common in many cases is: The system totally freezes. No keyboard
> interaction possible. No kernel panic. No coredump. Nothing in the
> logs. Network (ICMP, routing) looks up. But no userland action.

Hmm, oddly, the person who started that thread, frantisek holop,
figured out that their system *was* panicking and provided enough
information that I've started bouncing around an idea about a possible
cause with some other developers.


> Different are the situations: Some users observe this during boot,
> others in X during night, some see a high diskio just before the
> freeze, others see heavy network load. Some systems run in a VM,
> others on real hardware. Sometimes the issue is reproducible at the
> same time during night, in other cases it occurs randomly.
>
> So we have a wide variety of situations, but often the same result:
> Total freeze without any log or coredump.
>
> Let's assume all these cases have something in common. Then something
> very fundamental is broken.
>
> On the other hand, is it really likely all these cases are different
> bugs?

Your case, as far as you described it, is not the same as frantisek holop's.


> To the developers: What is to provide if users did not have anything
> in their logs, no coredump, nothing. Only a totally frozen system? Maybe
> dmesg and config files, right? And a verbal description what happens,
> right?

Most of the descriptions I've seen have been too imprecise to help in
diagnosis.
  "It freezes somewhere after "starting network daemons" and "starting
  local daemons". I tried to disable services I do not essentially need
  or to substitute them with other solutions. So far no findings here."

Freezes 'somewhere'?  Hard to make hypotheses about the cause when
we're not told what processes were started, or whether it's consistent
from freeze to freeze.  If you turn on ddb.console=1 in sysctl.conf
can you break into ddb when it hangs?  What do trace and ps show in
that case?  show bcstats?  If you've performed tests of various sorts,
what did they show?  Negative results are sometimes _more_ important
than positive results; why bother doing a test if you're going to
throw out the result?  What hypotheses have been *excluded* by your
test results?
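
For anyone who hasn't done this before, a rough sketch of the steps I
mean.  The sysctl name and the ddb commands are the ones mentioned
above; the key sequence is my assumption for a typical i386/amd64
console, so check ddb(4) for your platform:

    # /etc/sysctl.conf: allow entering ddb from the console
    ddb.console=1

Reboot so the setting is applied (or, as root, run
"sysctl ddb.console=1").  When the machine hangs, try Ctrl-Alt-Esc on
the local console, or send a BREAK on a serial console.  If you get a
"ddb>" prompt, capture the output of:

    ddb> trace
    ddb> ps
    ddb> show bcstats

If you never get a prompt, that is itself a result worth reporting.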

The title of the original thread was "snapshots total freeze", but
there were dmesgs in the thread showing August kernel builds; for those
who haven't tried running a (recent) snapshot, does your problem
reproduce or change symptoms when you do?

Is this consistent across hardware?  Drop another machine into place
where the freezing one is; does it freeze too?


Philip Guenther
