My suggestion: can't we create a troubleshooting database?

On Sun, Aug 15, 2010 at 11:30 AM, Borden Rhodes <j...@bordenrhodes.com> wrote:
> Good morning,
>
> I'm going to list some of the frustrations I've been having with
> troubleshooting Linux's quirks, crashes and problems in hopes that
> someone may be able to help me (and the community) become better bug
> reporters and troubleshooters. I'll make comparisons to Windows only
> because I am used to fixing the same problems in Windows a certain way -
> maybe there are analogies in Linux or maybe I'm approaching these
> problems the wrong way. I'm not trying to troll or flame-bait. I'm using
> Debian Squeeze, by the way.
>
> 1) Is there a way to apply debugging symbols retroactively to a dump? A
> few times I've had Linux crash on me and spit out a debugging dump. I do
> my best to install debugging symbols for all 1400 packages I have on my
> system (when I can find them) but this requires a huge amount of hard
> disk space and, invariably, the odd dump is missing symbols. Recreating
> the crash isn't always possible. Is there (or could someone invent) a
> way to save a dump without the symbols, download the symbol tables and
> then regenerate the dump with the symbols so it's useful to developers?
>
> 2) I find that the logs contain lots of facts but not a whole lot of
> useful information (if any) when something goes wrong. I've had KDE go
> black-screen on me, for example, and force a hard reboot but there's no
> mention whatsoever (that I can find) in xorg.log, kdm.log, messages,
> syslog or dmesg. Windows seems to be fairly good at making its last
> breath a stop error before it dies, which means when I get back into the
> system (or when I'm looking at a client's computer days after) I can
> find that stop error, look it up and figure out what went wrong. Are
> Linux's logs designed for troubleshooting or only for monitoring? Are
> proper troubleshooting logs kept somewhere else or in a special file?
> Is there a guide on how to read Linux's logs so I can make sense out of
> them like I can Windows' logs?
>
> 3) Linux needs better troubleshooting and recovery systems. The answer
> I usually get when I get an unexplained error is to run the program
> inside gdb or with valgrind. I'm not convinced that this is a practical
> way to troubleshoot serious problems (like kernel panics) and it
> requires a certain amount of foresight that a problem will occur.
> According to this logic, the only way that someone can produce useful
> reports and feedback (or even get a clue as to what happened) on the
> day-to-day crashes and bugs is to start Linux and all of its
> subprocesses inside valgrind and/or gdb. This is obviously not an
> intended use of these programs.
>
> This is what would make it easier (at least for me) to troubleshoot
> Linux problems. If these features exist, please let me know so I can
> start using them (they should probably be documented in the man pages
> too).
>
> 1) Logs need to have useful information. When I look at a client's
> Windows box days after they report something going wrong, the logs tell
> me at what time the problem happened, which process failed and what
> error it threw just before it blew. I can look those error codes up and
> (usually) fix the problem within an hour. When something dies on Linux,
> the log entry (assuming it even makes one) only tells me how many
> seconds into that particular boot the problem occurred. I've never been
> able to go back a few days later and find the log entries related to a
> particular crash - maybe because they've been purged. I know that the
> Linux tradition is to identify processes only by ID but surely there
> must be a way that it can print a file or package name or anything more
> useful than memory addresses and registers so at least I know where to
> start pointing fingers. Several people have told me that it's pointless
> trying to debug a dump in the logs. What's the point of dumping it in
> the first place if nobody can read it?
>
> 2) I wish error logs had simple codes or messages (which have
> documentation) like Windows Stop errors so I can look them up and
> figure out why something died. Oftentimes I try to Google the whole
> error message and either get directed to source code or totally
> irrelevant postings (since it seems that many messages are reused for
> all kinds of problems). For example, 'segfault' gets thrown so much
> that it only tells you that the program crashed - something I already
> know.
>
> 3) Logs need better organisation. I'm looking at the most recent dump
> and each message is printed on its own line. The problem is that
> interspersed in those individual lines may be other entries from other
> events not related to the problem in question. When I look at a Windows
> log, each event is entirely contained in one entry. It doesn't make one
> entry for "Stop", another entry for the Stop number, another 4 entries
> for the parameters and more entries for whatever other information
> usually is in them - whilst having other entries amid the list with
> what other things were doing at the time. I find Linux logs very
> frustrating to read for that reason since I don't know when an event is
> finished reporting or which items are relevant.
>
> 4) Logs need to focus on reporting on one thing and making sure it
> reports that one thing well. Other than formatting, I can't see many
> differences between syslog, dmesg and messages. Xorg.log is some help
> for troubleshooting misconfigured xorg.conf files (which are deprecated
> anyway) but not very useful if your X session burns down. kdm.log seems
> identical to Xorg.log except for a few KDE-specific entries. I had to
> uninstall my firewall because it kept writing firewall entries to
> messages (and stdout) and I couldn't figure out how to get it to stop.
> Why isn't there one log that only deals with hardware status and
> changes, another one that only deals with network status and firewall
> logging, another one which only deals with dumps and crashes and so on?
>
> Maybe I just haven't found the right manual yet that has all of these
> answers so I'd appreciate any direction.
>
> With regards,
>
> Borden Rhodes
>
>
> --
> To UNSUBSCRIBE, email to debian-user-requ...@lists.debian.org
> with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org
> Archive: http://lists.debian.org/201008150200.52677.j...@bordenrhodes.com

--
Wishing you the very best of everything, always!!!
Kousik Maiti (কৌশিক মাইতি)
Registered Linux User #474025
Registered Ubuntu User #28654
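
On the quoted point that a kernel log entry "only tells me how many seconds into that particular boot the problem occurred": a rough sketch of turning those seconds-since-boot stamps into wall-clock times. It assumes the default "[  seconds.micro] message" prefix and reads the boot time from the "btime" line of /proc/stat. The function names are made up for illustration. The result is only approximate, since the kernel stamp may not advance while the machine is suspended, and it only applies to messages from the current boot. Newer util-linux versions of dmesg have a -T option that does much the same thing.

```python
import re
from datetime import datetime, timedelta, timezone

# Matches the "[   12.345678] message" prefix found on kernel log lines.
STAMP = re.compile(r"^\[\s*(\d+\.\d+)\]\s?(.*)$")


def read_boot_time(path="/proc/stat"):
    """Return the boot time from the 'btime' line of /proc/stat (Linux only)."""
    with open(path) as f:
        for line in f:
            if line.startswith("btime "):
                return datetime.fromtimestamp(int(line.split()[1]), tz=timezone.utc)
    raise RuntimeError("no btime line found in " + path)


def to_wall_clock(line, boot_time):
    """Return (datetime, message) for a stamped line, or None if it has no stamp."""
    m = STAMP.match(line)
    if m is None:
        return None
    return boot_time + timedelta(seconds=float(m.group(1))), m.group(2)


def convert(lines, boot_time):
    """Yield each line with its stamp rewritten as an ISO time; pass others through."""
    for line in lines:
        parsed = to_wall_clock(line, boot_time)
        if parsed is None:
            yield line
        else:
            yield parsed[0].isoformat() + " " + parsed[1]
```

To try it on a live system, something like `for l in convert(open("/var/log/dmesg"), read_boot_time()): print(l, end="")` should do, assuming that file exists and belongs to the current boot.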